Economic Exchange during Hyperinflation


Alessandra Casella 

University of California, Berkeley


Jonathan S. Feinstein 

Stanford University 


Historical evidence indicates that hyperinflations can disrupt individuals’
normal trading patterns and impede the orderly functioning
of markets. To explore these issues, we construct a theoretical model
of hyperinflation that focuses on individuals and their process of
economic exchange. In our model, buyers must carry cash while
shopping, and some transactions take place in a decentralized setting
in which buyer and seller negotiate over the terms of trade of an
indivisible good. Since buyers face the constant threat of incoming
younger (hence richer) customers, their bargaining position is weakened
by inflation, allowing sellers to extract a higher real price. However,
we show that higher inflation also reduces buyers’ search,
increasing sellers’ wait for customers. As a result, the volume
of transactions concluded in the decentralized sector falls. At high
enough rates of inflation, all agents suffer a welfare loss.


I. Introduction 

Since Cagan (1956), economic analyses of hyperinflations have focused
attention on aggregate time-series data, especially exploding
price levels and money stocks, lost output, and depreciating exchange
rates. Work on the German hyperinflation of the twenties provides a
good example of this approach; see, among others, the econometric
studies of money demand by Sargent and Wallace (1973), Frenkel
(1977), Salemi and Sargent (1979), and Christiano (1987) and the
tests for speculative price bubbles presented by Flood and Garber
(1980), Burmeister and Wall (1982, 1987), and Casella (1989).

If we compare these studies with historical accounts of the German 
experience, the difference in focus is striking. Whereas economists 
concentrate on aggregate performance measures, historians emphasize
the hyperinflation’s disruptive impact on individuals and on their
socioeconomic relationships. Previously stable trading connections
were severed, transactions patterns were altered, and normally well-functioning
markets collapsed (see Feldman 1977, 1989; Feldman et
al. 1984; Kunz 1986; Moeller 1986). 

In this paper we present an attempt to bridge the gap between the 
historical literature and economic analysis. We construct a formal 
economic model of hyperinflation that centers on individual trading 
patterns and on the exchange process, and we explore the implications
of this model for relative prices, market structure, and social
welfare. Our results are consistent with historical evidence, but our
model naturally addresses only a small part of a larger set of issues.
Our main purpose is to show, by example, the usefulness of reexamining
hyperinflations at a more detailed analytic level.

Our approach depends on two main assumptions, both of which 
have been motivated by our reading of life experiences during the 
German inflation. First, we assume that domestic money is required 
for all transactions. Two pieces of evidence support this assumption: 
the legal restriction that prohibited the holding of foreign currencies 
and the general unwillingness to resort to barter up until the very last
months prior to stabilization. 

Second, we construct our model so as to emphasize time and the 
importance of converting depreciating nominal accounts into real 
goods as quickly as possible. Many anecdotes reveal how important 
time became during the German inflation: 

Almost daily at the ten o’clock break I used to see the 
teachers trooping down into the [school’s] playground 
where their friends and relatives were waiting, into whose 
hands they thrust the money that they’d just received so 
that it could be spent before the prices went up. [Dorothy 
Haenkel in Guttmann and Meehan (1975, p. 80)] 

At eleven o’clock in the morning a siren sounded and 
everybody gathered in the factory forecourt where a five- 
ton lorry was drawn up loaded brimful with paper money. 
The chief cashier and his assistants climbed up on top. 
They read out names and just threw out bundles of notes.
As soon as you had caught one you made a dash for the 
nearest shop and bought just anything that was going. 
[Willy Derkow in Guttmann and Meehan (1975, pp. 57-58)]

We incorporate these two assumptions into a simple model of an
economy populated by overlapping generations of agents and consisting
of two sectors labeled red and blue. We focus mainly on the red
sector, which is a decentralized market in which buyers must search 
for an available seller. When a buyer and seller meet, they bargain 
over the terms of trade of the indivisible red good, and we model this 
bargaining as a modified version of Rubinstein’s (1982) game. As we 
shall see, the buyer’s need to convert money into the red good as 
quickly as possible strongly affects the outcome of the bargain and has 
further repercussions on search behavior and red market structure. 
The blue sector, in contrast to the red, is centralized, consisting of a 
perfectly competitive market in which all sellers charge the same equilibrium
price for the blue good. In accordance with our first assumption
above, all transactions on both markets require cash. Finally,
additional money is injected into the economy each period; hence 
nominal prices are rising on both markets. 

Our analysis of this model explores three issues, in each case leading
to results that seem consistent with available evidence. The first
issue is the effect of inflation on relative prices and relative incomes. 
A buyer in the decentralized sector must carry cash while searching 
for an available seller and negotiating the terms of trade. Hence he 
faces the continual depreciation of his real money holdings and, most 
important, the pressing competition of new buyers, who enter the 
market with increasing amounts of nominal money. The buyer’s 
weakening position over time stands in sharp contrast to the stable 
position of the seller, who is able to hold his nondepreciating good 
until successfully completing a transaction. This asymmetry between 
the two agents translates into an increase in the bargaining power of 
the seller and a redistribution of the trading surplus in his favor: the 
relative price of the good sold in the decentralized sector and the 
seller’s real income are higher the higher is the economy’s inflation 
rate. This conclusion appears to capture, and rationalize, anecdotal 
evidence: “The frantic urge to buy, combined with the reluctance of 
producers and owners of goods to sell for depreciating marks, naturally
drove prices up” (Guttmann and Meehan 1975, p. 29).

The second issue we explore is the impact of inflation on agents’ 
welfare. In accordance with the partial equilibrium intuition discussed
above, we find that sellers in the decentralized sector benefit
from increased inflation over a wide range of inflation rates, while
buyers suffer. Hence in our model there is no inflation rate that is
unequivocally Pareto dominant, excluding lump-sum transfers. However,
we also find that at sufficiently high inflation rates, both buyers’
and sellers’ expected utilities fall. The intuition is simple: the buyers 
will continue their search as long as they have a positive probability of 
engaging in a successful transaction. But inflation is eroding their real 
balances, and eventually they might become so “poor” that no seller 
could be induced to consider them as potential customers. At this 
point, they would leave the market. At higher inflation rates the number
of periods allowed for search will be lower, and the average number
of buyers per store smaller. This will in turn negatively affect the
sellers, who will have to wait longer before being contacted by a suitable
customer. In a sense, sellers suffer from a lack of coordination;
each individually prefers not to talk to “older” buyers and disregards 
the impact of his decision on the group as a whole. Again, this result 
appears to coincide with available evidence: “Shops remained empty, 
and their suppliers, unable for this reason to get rid of their wares, 
reduced production” (Guttmann and Meehan 1975, p. 75). 

Finally, we calculate the velocity of circulation of money in the 
economy. At higher inflation rates, as buyers leave the decentralized 
market at earlier dates, the average time during which they hold 
currency is shortened, and the velocity of circulation is increased. Any 
new injection of money will then have a larger effect on the price 
level. This is, of course, the traditional assumption of hyperinflation 
models, here derived endogenously. 

Our results bear comparison with the large and diverse literature 
on inflation and, more specifically, with previous analyses studying 
the effect of inflation on trade, when money is required for transactions.
Originating with Friedman (1969), this literature sees inflation
as a distortionary tax on money balances. In a world in which lump-sum
taxes are available, the optimal rate of nominal price increase
equals the negative of the real interest rate: the cost of holding money 
is then equalized to the social cost of supplying money, that is, to zero. 
Bewley (1980) and Townsend (1980) prove formally that if agents are 
infinitely lived and have a common discount rate, the required price 
deflation entails an infinite money stock. Government intervention in 
the form of lump-sum taxes on money is necessary for Friedman’s 
result. The initial intuition is applied to more complex models by
Jovanovic (1982), Rotemberg (1984), and Romer (1986), among 
others, without altering the conclusion. Lucas and Stokey (1983) and 
Lucas (1986) enrich the framework to a world in which some goods 
are purchased with money and others with credit. Here inflation distorts
society’s allocation of resources away from the cash-in-advance
sector by changing the relative cost of the two goods, and Friedman’s
rule is again optimal. In our model, all transactions require cash and
Friedman’s result does not hold. However, the logic of these latter 
works does carry over to our economy, in a modified form: when 
economic sectors differ in their exchange technologies, inflation may 
affect them differently, distorting relative prices. The papers by 
Lucas and by Lucas and Stokey and our analysis are all examples of 
this general effect. 

Finally, the model presented here represents a natural extension of 
the work of Diamond (1982, 1984) and Diamond and Yellin (1984, 
1985), who study search and decentralized exchange. The important 
difference is the introduction of strategic bargaining as a way to determine
how the surplus from trade will be shared. Given inflation
and the cash-in-advance constraint, this feature endogenizes the maximum
number of periods allowed for search. It is then possible to
follow, in a relatively simple way, the distribution of money holdings 
in the economy and therefore to completely characterize the general 
equilibrium solution, even in the presence of inflation. 

The paper is organized as follows. Section II presents the model. 
Then Sections III-VII analyze the simplest cases of the model. Section
III explicitly derives the solution of the bargaining game for
specific ranges of inflation rates that lead to a simplified analysis; 
Section IV describes in detail the matching process through which 
buyers and sellers meet; Section V derives the inflation rate, as a 
function of the increase in the money supply and the (endogenous) 
circulation of agents in the economy; and Sections VI and VII characterize
the complete solution of the model. Finally, Section VIII extends
the previous results to the general case, and Section IX presents
conclusions. 


II. The Model 

The economy is composed of two sectors. In the blue market a divisible
good is sold competitively: this market is centralized, and equilibrium
prices are posted and adhered to by all traders. In contrast, in
the red market the exchange of a storable, indivisible commodity is 
decentralized: individual buyers and sellers bargain over terms of 
trade prior to transaction. 

There is no production, and each period $n_b$ blue agents and $n_r$ red
agents are born with an endowment of one unit of their respective
good. Their utility functions are

$$U_r = \delta^s q_b,$$

$$U_b = \begin{cases} \delta^s q_b & \text{if the red good has been purchased,} \\ \delta^s q_b - A & \text{otherwise,} \end{cases}$$

where $\delta$ ($< 1$) is the common discount factor, $q_b$ is the consumption of
the blue good, and $s$ is the number of periods since birth when this
consumption takes place.

Everybody desires the blue good, while the red good is valued by 
only the blue consumers, who suffer a large disutility, A , if they are 
unable to purchase it. We assume that A is large enough to make 
them willing to pay any feasible price for the red commodity and to 
remain in the red market as long as there is any positive probability of 
a successful transaction. 1 

Agents exit the economy after consumption; hence the model is a 
modified overlapping generations framework in which the number of 
periods till death is endogenous. 

Transactions in the red market are characterized by a matching 
process, through which traders meet, and a bargaining game. We 
assume that identical red sellers sit in their stores waiting for blue 
buyers who choose randomly and independently where to shop. 
When customers enter a store, the seller decides which of them, if 
any, he wants to address and starts bargaining with him over the price 
of the red good. Since buyers choose the stores independently, each 
has a positive probability of not being alone and of not being selected 
by the seller. If this occurs, the neglected customer can either wait, 
look for another store, or leave the red market altogether. 2 

The bargaining is described by a modification of Rubinstein’s original
game with complete information. The buyer and the red seller
negotiate over the terms of trade of the red (indivisible) good. The
negotiation proceeds in a series of offers and counteroffers; the seller
quotes the first price, and the buyer can accept it or reject it. If he
rejects it, he makes a counteroffer. Whenever the red seller moves, he
has three options: he can continue to bargain with the original buyer,
choose a new customer (if a new customer shows up), or refuse to
make his offer and end the current bargaining even if no agreement
has been reached and no new customer has arrived. 3

1 A possible extension of the model would be to see how the equilibrium solution
reacts to changes in A. As the probability of engaging in a successful transaction in the 
red market becomes small, for certain values of A it might be optimal for a blue 
consumer, ceteris paribus, to avoid the effort of trying to purchase the red good. As 
this is true for all—identical—blue agents, the probability of success in the red market 
goes to one. Hence, equilibrium requires a mixed strategy. 

2 We might imagine that if more than one buyer arrived at his store, the seller could
hold an auction, selling the good for the highest price. If the auction is sealed-bid, he 
will extract (i) all the income of one member of the youngest generation of buyers 
present if more than one such buyer arrives or (ii) a price equal to the income of the 
second-youngest buyer. Our bargaining model continues to apply when only one buyer 
shows up (which occurs with positive probability), with an appropriate modification of 
the expected off-equilibrium price. 

3 Both these possibilities are available but irrelevant in Rubinstein’s original game
with identical players. In our model, inflation creates differences between the buyers
and therefore lends importance to these otherwise negligible moves.
It is assumed that goods can be bought only with money. There is
no credit system, and a cash-in-advance constraint forces consumers
to hold their nominal balances while shopping. In fact, agents’ endowments
are such that a nonmonetary barter equilibrium could be supported;
hence we must assume the transactions role of money, rather
than derive it endogenously as in classic overlapping generations
models.

A source of money, and inflation, is a supposedly benign govern¬ 
ment agency, called “the bank,” whose sole activity is to multiply the 
money holdings of its clients and therefore the money circulating in 
the economy. For simplicity, we assume that the blue consumers go to 
the bank on the way to the red market, but any other configuration 
(i.e., both red and blue, or only the red, go to the bank) would leave 
the substance of our analysis unaffected. Indeed, the same results 
could be obtained by having the government directly buying blue 
goods with newly printed money or by making any assumption that 
would capture the progressive increase in the nominal money holdings
of successive generations. The nominal price on the blue market,
$p_{bt}$, goes up steadily because the same total endowment is traded each
period for an always increasing money stock.

In equilibrium, the life of a blue consumer is described by the 
following succession of moves. At time t, he is born with one unit of 
the blue good, which he immediately sells on the competitive market, 
earning the nominal price $p_{bt}$. At time $t + 1$, he goes to the red market
and, on the way, stops at the bank, where his money holdings are
increased by a factor of $\alpha_0$. With $\alpha_0 p_{bt}$ in cash, he enters a red seller’s
store and, if chosen, starts bargaining. At time $t + 2$, if the bargaining
has been successful, he returns to the blue market, where he spends 
all his remaining money. If he was not chosen by the red seller in 
period t + 1, he goes to another red store. The blue’s search for a 
willing seller continues until either he succeeds in buying the red 
good or his money holdings have so depreciated that no seller would 
trade with him. In all cases, his final move is to spend all the money he 
has left in the blue market. 

As for the red consumer, at time t he is born with one indivisible 
red good and stays in his store waiting for a customer. If a single 
buyer enters, the seller makes him an offer, unless he considers him 
too poor. If more than one buyer enters the store, he addresses the 
one who can afford to spend the most or, if all are identical, he 
chooses one randomly (unless all are too poor). At time t + 1, if he 
made an offer to a buyer at time t, the transaction is completed (in 
equilibrium) and he goes to the blue market to spend the proceeds. 
Otherwise he again waits for a suitable customer: he has to remain in 
his store as long as needed to sell his good and earn the money that he 
will finally spend in the blue market. 
As we shall see, the economy is completely characterized by three
fundamental parameters: $\delta$, the discount factor; $n_b/n_r$, the ratio of blue
to red consumers born each period; and $\alpha_0$, the multiplicative factor
describing the intervention of the bank. In the following sections, we
compute steady-state equilibria, as functions of these parameters, and 
compare relative prices and expected utilities. 

III. The Bargaining Game 

With inflation eroding the real value of his money holdings, a blue 
buyer cannot continue to search indefinitely for a well-disposed seller 
in the red market: if he has been unsuccessful (has not been chosen by 
any of the sellers he has approached) for too long, his real wealth 
might have become so low that any seller would rather wait for future 
potential customers than bargain with him, even if he is the only customer
in the store.

Of course, the critical number of periods allowed for search depends
on the inflation rate, and we can define different “regimes” as
ranges of inflation rates corresponding to a specific number of periods
that can at most be devoted to search. Within each of these
regimes, the red seller is willing to bargain only with customers whose
“age” in the market is below that critical threshold. 4

To keep the exposition simple and the equations tractable, we start 
by concentrating on a specific example: a range of inflation rates such 
that the red sellers are willing to bargain at most with buyers who have 
just arrived in the market (young) or have gone through a single 
unlucky search (old). We call this the two-generation regime. We then 
explicitly derive the equilibrium results and study the transition to the 
one-generation regime, with its welfare implications. The final section 
of the paper presents the solution for the general case. 


Bargaining Equilibrium: The Two-Generation Regime 

The bargaining game is complicated by the presence in the market of
buyers of different ages and by the uncertainty created by the matching
process. When a buyer enters a store, neither he nor the seller
knows whether any new customer, young or old, will arrive in the 
near future. In addition, the original buyer himself can be either 
young or old. 

The simplifying feature of the game is its preserved stationarity: 
nothing is modified over time, and the value of bargaining with a
specific buyer (young or old) is the same in any period. It is this
fundamental property that characterizes the game.

4 In the model we discuss, the length of the time unit, the bargaining period, is
exogenous and fixed. A nontrivial improvement would be to have it determined endogenously.
In that case, it would clearly be affected by the inflation rate, with seller and
buyer presumably having contrasting objectives.

A few observations will help us derive the solution. 

1. Since the seller is not willing to bargain with buyers who have 
been in the market more than two periods (very old), he will always 
stop negotiating with the original customer if no agreement has been 
reached before period 3. However, he has to give the customer the 
right to a counteroffer in period 2, regardless of his age or of the 
possible flow of competing buyers. 

2. In period 3, the seller will switch to a young buyer, if one enters 
the store, or to an old buyer, if at least one arrives and no young 
buyers are present at the time (see the proof in Casella and Feinstein 
[1987]). If nobody enters the store, the seller will wait. Therefore, the
ex ante value of the game to the seller in period 3 in the two-generation
regime, $x_2$, is

$$x_2 = v_{y2}\,p_{y2} + v_{o2}\,p_{o2} + v_{n2}\,\delta x_2, \tag{1}$$

where $v_{i2}$ is the probability of addressing a customer of type $i$ ($i = y, o$)
in the two-generation regime, $v_{n2}$ is the probability of having no new
customers, and $p_{i2}$ is the real price of the good when sold to a customer
of type $i$.

3. Let us define as $I_i$ the real money balances of a buyer of type $i$ in
period 1 and as $\pi$ the rate of nominal price increase, per period, in
the blue market (which will later be determined endogenously). When
we exploit stationarity, the sequence of alternating offers that would 
take place off equilibrium, and that determines the equilibrium price, 
can be analyzed in the elegant diagram proposed by Sutton (1986). 


Period   Offer Sequence    Seller’s Share                                            Buyer’s Share

1        Seller to buyer   $\min\{I_i,\ I_i - \delta[(I_i/\alpha) - \delta x_2]\}$   $\max\{\delta[(I_i/\alpha) - \delta x_2],\ 0\}$

2        Buyer to seller   $\delta x_2$                                              $\max[(I_i/\alpha) - \delta x_2,\ 0]$

3        Seller to buyer   $x_2$

Since the value of the game to the seller in period 3 is $x_2$, if no
agreement had been reached in the first period, the buyer, moving at
time 2, should offer at least $\delta x_2$, keeping for himself $(I_i/\alpha) - \delta x_2$, if
positive, or zero (where $\alpha \equiv 1 + \pi$). Anticipating this, the seller in
period 1 will quote as a price $I_i - \delta[(I_i/\alpha) - \delta x_2]$, if less than $I_i$, or $I_i$,
leaving to the buyer $\delta[(I_i/\alpha) - \delta x_2]$ or zero. This characterizes a perfect
equilibrium in which trade takes place in the first period and

$$p_{i2} = \min\left\{ I_i,\ I_i\left(1 - \frac{\delta}{\alpha}\right) + \delta^2 x_2 \right\}, \tag{2}$$

where $x_2$ is defined by equation (1) and $\alpha \equiv 1 + \pi$.
Equations (2) and (1) establish that this price is nondecreasing in inflation.
As expected, the combination of inflation with a cash-in-advance
constraint weakens the bargaining position of the buyer. In
addition, since we know that $I_y = \alpha I_o$ (by the definition of young and
old), equation (2) implies that the young buyers always pay a higher
price than the old, justifying the preference of the seller, but a smaller
share of their wealth. As expected, successful search has a positive
payoff.
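
The way equations (1) and (2) determine prices jointly can be illustrated numerically. The Python sketch below is not part of the original analysis: it takes the meeting probabilities as given inputs (the paper derives them later from the matching process), normalizes the young buyer's real balances to an arbitrary value, and solves for $x_2$ by simple fixed-point iteration. The function names and the iteration scheme are illustrative assumptions.

```python
def first_period_price(balances, delta, alpha, x2):
    """Equation (2): the seller's period-1 quote to a buyer with real balances
    `balances`, i.e. min{I, I(1 - delta/alpha) + delta^2 x2}."""
    # What the buyer could keep by waiting and counteroffering in period 2.
    buyer_share_period2 = max(balances / alpha - delta * x2, 0.0)
    return balances - delta * buyer_share_period2


def solve_two_generation(I_y, delta, alpha, v_y, v_o, v_n, iterations=2000):
    """Iterate on equation (1) until x2 settles (hypothetical helper).

    The map is a contraction with modulus at most delta, so a fixed number of
    iterations suffices unless delta is extremely close to one.
    """
    I_o = I_y / alpha  # the old buyer's balances have depreciated one period
    x2 = 0.0
    for _ in range(iterations):
        p_y = first_period_price(I_y, delta, alpha, x2)
        p_o = first_period_price(I_o, delta, alpha, x2)
        x2 = v_y * p_y + v_o * p_o + v_n * delta * x2
    p_y = first_period_price(I_y, delta, alpha, x2)
    p_o = first_period_price(I_o, delta, alpha, x2)
    return x2, p_y, p_o


# Illustrative parameter values, chosen arbitrarily.
x2, p_y, p_o = solve_two_generation(I_y=1.0, delta=0.9, alpha=1.15,
                                    v_y=0.5, v_o=0.3, v_n=0.2)
print(f"x2 = {x2:.4f}, p_y2 = {p_y:.4f}, p_o2 = {p_o:.4f}")
```

With any admissible inputs, the computed prices should display the pattern described in the text: the young buyer pays at least as much as the old buyer, but gives up a weakly smaller share of his wealth.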

The condition that restricts the seller to bargaining with only two
generations of buyers determines the range of inflation rates underlying
the equilibrium described by equations (1) and (2). It must be true
that if every other store sells to young and old customers, it is not
profitable for a single seller to deviate and either expand his trade to
the very old or restrict it to only the young. This implies

$$p_{vo2} \le \delta x_2, \tag{3}$$

$$p_{o2} \ge \delta x_1, \tag{4}$$

where $\delta x_2$ is, as before, the value to the seller of waiting another
period, when his only customer is a very old buyer and the seller
contemplates bargaining with young and old only. Similarly, $\delta x_1$ is the
value of waiting when the one available buyer is old and the seller has
the alternative of addressing only young customers.

The very old buyer would rather exchange his whole money holdings
for the red good than abandon the hope of purchasing it and
incur a large disutility. Therefore, the two-generation regime can be
an equilibrium only if

$$I_{vo} \le \delta x_2, \tag{5}$$

with equation (5) holding with equality at the inflation rate at which
the transition to the two-generation regime takes place. Recalling that
$I_{vo} = I_o/\alpha$ and

$$p_{o2} = \min\left\{ I_o,\ I_o\left(1 - \frac{\delta}{\alpha}\right) + \delta^2 x_2 \right\}, \tag{2'}$$

and substituting (5) in (2'), we conclude that

$$p_{o2} = I_o. \tag{6}$$

The lower bound of the range of inflation rates for which the two-generation
regime is an equilibrium is exactly the rate at which, in the
present regime, the seller will extract the whole surplus from the old
buyers ($\pi_o$). 5 Taking this into account, we can obtain the equation
defining $\pi_o$ from (1) and (2): $\pi_o$ solves

$$v_{y2}(1 + \pi_o)^2 + (1 + \pi_o)(v_{o2} - \delta v_{y2}) - \left[ v_{o2} + v_{y2}(1 - \delta) + \frac{1 - \delta}{\delta} \right] = 0. \tag{7}$$

Equation (4) will give us the upper bound of the relevant range of
inflation rates. When one seller deviates and trades with young customers
only, the ex ante value of the bargaining game from his point
of view is

$$x_1 = v_{y2}\,p_{y1} + (1 - v_{y2})\,\delta x_1, \tag{8}$$

and, when we repeat once more the usual logical steps, the price set to
the young buyers is

$$p_{y1} = \min\left\{ I_y,\ I_y\left(1 - \frac{\delta}{\alpha}\right) + \delta^2 x_1 \right\}, \tag{9}$$

which implies

$$x_1 = \min\left\{ \frac{v_{y2}\,I_y}{1 - \delta(1 - v_{y2})},\ \frac{v_{y2}\,I_y\,[1 - (\delta/\alpha)]}{1 - \delta(1 - v_{y2}) - \delta^2 v_{y2}} \right\}. \tag{10}$$

Substituting (10) and (6) in (4), we get

$$\pi \le \frac{1 - \delta}{\delta v_{y2}}. \tag{11}$$


Equation (11) defines the highest inflation rate consistent with the
two-generation equilibrium. (Substitution of [11] in [7] confirms that
it is above $\pi_o$.) Note that this rate is also exactly equal to $\pi_y$, the inflation
at which the young buyers pay their whole monetary wealth for
the red good. In fact, equations (6), (1), and (2) imply

$$p_{y2} = \min\left\{ I_y,\ I_y\,\frac{(1 - \delta v_{n2})[1 - (\delta/\alpha)] + (\delta^2/\alpha)\,v_{o2}}{1 - \delta v_{n2} - \delta^2 v_{y2}} \right\}, \tag{12}$$

and simple differentiation shows that the share of wealth that the
seller receives increases with inflation, reaching one when inflation
reaches

$$\pi_y = \frac{1 - \delta}{\delta v_{y2}}. \tag{13}$$

5 Note that the result does not depend on the specific utility function we assume. In
general, the expected value of the game to the seller in period 3 will be $U'$, and the
buyer will need to offer him $\delta U'$ in period 2. But the optimality of the two-generation
regime requires $I_{vo} = I_o/\alpha \le \delta U'$, and the old buyer in period 2 will be left with
$\max[0, (I_o/\alpha) - \delta U'] = 0$. The result is more general: in any regime, the price the sellers
quote to the oldest generation of buyers with whom they trade must equal their
whole monetary wealth.


In summary, the two-generation regime is an equilibrium if the inflation
rate is above the rate at which the old buyers trade their entire
money holdings for the red good ($\pi_o$) but below the rate at which this
is true for the young buyers. Equations (7) and (13) define $\pi_o$ and $\pi_y$.
Within this regime, there is a perfect equilibrium such that the red
good is sold in the first period of negotiations, and its real price
depends on the type of buyer being addressed by the seller.
Specifically,

$$p_{o2} = I_o \tag{6}$$

and

$$p_{y2} = I_y\,\frac{(1 - \delta v_{n2})[1 - (\delta/\alpha)] + (\delta^2/\alpha)\,v_{o2}}{1 - \delta v_{n2} - \delta^2 v_{y2}}. \tag{14}$$
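
The bounds of the two-generation regime can be evaluated directly from equations (7) and (13). The Python sketch below is an illustration only: it assumes the meeting probabilities are known numbers that sum to one (in the paper they are endogenous), solves the quadratic (7) for its positive root, and evaluates the young buyer's price share from (14). The function names are hypothetical.

```python
import math


def pi_old(delta, v_y, v_o):
    """Lower bound pi_o: positive root of equation (7), a quadratic in
    alpha = 1 + pi_o. Requires v_y > 0."""
    a = v_y
    b = v_o - delta * v_y
    c = -(v_o + v_y * (1 - delta) + (1 - delta) / delta)
    alpha = (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)
    return alpha - 1


def pi_young(delta, v_y):
    """Upper bound pi_y from equation (13)."""
    return (1 - delta) / (delta * v_y)


def young_price_share(delta, alpha, v_y, v_o, v_n):
    """Share p_y2 / I_y of the young buyer's wealth paid to the seller,
    from equation (14), capped at one as in equation (12)."""
    numerator = (1 - delta * v_n) * (1 - delta / alpha) + (delta ** 2 / alpha) * v_o
    denominator = 1 - delta * v_n - delta ** 2 * v_y
    return min(1.0, numerator / denominator)


# Illustrative values, chosen arbitrarily.
delta, v_y, v_o, v_n = 0.9, 0.5, 0.3, 0.2
lower, upper = pi_old(delta, v_y, v_o), pi_young(delta, v_y)
print(f"two-generation regime for pi in [{lower:.4f}, {upper:.4f}]")
print(f"young share at pi = 0.15: {young_price_share(delta, 1.15, v_y, v_o, v_n):.4f}")
```

For these inputs the lower bound lies below the upper bound, and the young buyer's share reaches one exactly at the upper bound, in line with the summary above.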


Bargaining Equilibrium: The One-Generation Regime 

The derivation of the equilibrium price when it is optimal for the
seller to address only young customers follows exactly the methodology
described above.

The price, as already mentioned, is given by equation (9):

$$p_{y1} = \min\left\{ I_y,\ I_y\left(1 - \frac{\delta}{\alpha}\right) + \delta^2 x_1 \right\}, \tag{9}$$

where

$$x_1 = v_{y1}\,p_{y1} + (1 - v_{y1})\,\delta x_1. \tag{8'}$$

Notice, however, that the probability that at least one young customer
will enter the store is now $v_{y1}$: since the regime has changed, so has the
velocity of circulation of buyers in the market and therefore the probability
laws describing meetings between buyers and sellers. We derive
these probabilities below.

For equations (9) and (8') to characterize an equilibrium, we require
that, given that everybody is selling only to the young, there be
no incentive to deviate:

$$p_{o1} \le \delta x_1. \tag{15}$$

6 Another line of reasoning that leads to the same conclusion is that at the inflation
rate at which the seller is indifferent between addressing both young and old buyers
and only young buyers, $p_{o2} = I_o = \delta x_1$ and $x_1 = x_2$. Substituting these two conditions in
eq. (2) for $i = y$, we obtain $p_{y2} = I_y$ at the critical rate.

Again, the old buyers would be willing to trade all their money for the
red good, implying

$$I_o \le \delta x_1. \tag{16}$$

Substitution of (16) in (9) leads to

$$p_{y1} = I_y. \tag{17}$$

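But equations (9) and (8') can be solved explicitly:

$$p_{y1} = \min\left\{ I_y,\ I_y\,\frac{[1 - \delta(1 - v_{y1})][1 - (\delta/\alpha)]}{1 - \delta(1 - v_{y1}) - \delta^2 v_{y1}} \right\}, \tag{18}$$

and therefore equation (17) is equivalent to

$$\pi \ge \frac{1 - \delta}{\delta v_{y1}}. \tag{19}$$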

The conclusion is that it is optimal for the sellers to trade with only 
young customers whenever condition (19) is satisfied. For such inflation
rates, the young buyers will have to exchange their whole wealth
for the red good. 
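
The equivalence between equation (17) and condition (19) can be checked numerically. The sketch below is an illustration under stated assumptions: the meeting probability $v_{y1}$ is supplied as a number rather than derived from the matching process, and the function names are hypothetical.

```python
def one_generation_share(delta, alpha, v_y1):
    """Share p_y1 / I_y implied by equation (18): the interior solution of
    (9) and (8'), capped at one."""
    keep_waiting = 1 - delta * (1 - v_y1)
    # The denominator equals (1 - delta)(1 + delta * v_y1), which is positive.
    interior = (1 - delta / alpha) * keep_waiting / (keep_waiting - delta ** 2 * v_y1)
    return min(1.0, interior)


def one_generation_threshold(delta, v_y1):
    """Smallest inflation rate satisfying condition (19)."""
    return (1 - delta) / (delta * v_y1)


# Illustrative values, chosen arbitrarily.
delta, v_y1 = 0.9, 0.6
threshold = one_generation_threshold(delta, v_y1)
for pi in (0.10, threshold, 0.30):
    share = one_generation_share(delta, 1 + pi, v_y1)
    print(f"pi = {pi:.4f}: seller's share of young wealth = {share:.4f}")
```

Below the threshold the interior branch of (18) applies and the seller's share is less than one; at or above it the young buyer hands over his whole wealth.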

If the probabilities of meeting different types of customers did not 
change across regimes, equations (11) and (19) would define a specific 
inflation rate at which the sellers stop considering the old buyers as 
suitable bargaining partners. Since this is not true, the probabilities 
have to be solved explicitly before anything can be said about the 
transition from one regime to the next. In addition, a complete solution
of the model needs to reconcile the inflation rate with its fundamental
cause, the intervention of the bank.

IV. The Matching Process 7 

Let us call $\theta_0$ the ratio of blue to red consumers born each period,
$\theta_0 = n_b/n_r$, where we assume that $n_b$ and $n_r$ are large. Let $N_r$ be the
number of red sellers that are active in the market each period, and
notice that $N_r$ exceeds $n_r$ whenever some red sellers have been unable
to complete a transaction in the past.

In the one-generation regime, the only buyers searching in the red market are the young; hence exactly $n_b$ buyers are active. Under the assumption that each buyer chooses which seller to visit independently of the other buyers, the probability that a particular seller is visited by no customers is $[(N_r - 1)/N_r]^{n_b}$, which represents the chances that all $n_b$ buyers go to one of the other $N_r - 1$ stores. We may rewrite this as

$$\left(1 - \frac{1}{N_r}\right)^{n_b} = \exp\left[n_b \log\left(1 - \frac{1}{N_r}\right)\right] \approx e^{-\theta},$$

where $\theta$ is defined to be $n_b/N_r$, and the approximation is valid to order $(1/N_r)$.$^{8}$ Therefore, the probability that a red seller meets a suitable buyer is

$$v_{y1} = 1 - e^{-\theta}. \tag{20}$$

7 A similar matching technology, embedding the Rubinstein game in a multiagent market, has been studied by Binmore and Herrero (1988). It was first proposed by Butters (1977) in a different context.

In a steady state, the number of red sellers born each period must just balance the number of sellers who successfully complete a transaction and exit:

$$n_r = (1 - e^{-\theta})N_r.^{9} \tag{21}$$

Dividing equation (21) by $n_b$ and rearranging, we obtain

$$\theta = (1 - e^{-\theta})\theta_0. \tag{22}$$

To guarantee the existence of a steady state, we require $\theta_0 > 1$: the number of blue consumers being born must be larger than the number of red consumers. In any period, the blue buyers leave the red market for one of two reasons: either they have concluded their transaction, in which case their exit is matched by an equal number of departing sellers, or they have by now become “too poor.” This second motivation has no parallel for the sellers, and therefore, for any positive inflation rate, they always leave the red market in smaller numbers than the blue buyers. If the cohorts being born were of equal size, the red market would eventually disappear.

In the two-generation regime, at any time there are two types of buyers searching in the red market: young and old. Each seller prefers to trade with the young customers, and the probability of meeting at least one of them is

$$v_{y2} = 1 - e^{-\theta_1}, \tag{23}$$

where $\theta_1 \equiv n_b/N_r$ replaces the earlier $\theta$, while $v_{y2}$ replaces the earlier $v_{y1}$; $N_r$ continues to denote the number of red sellers active in the market.

8 See Feller (1972, pp. 88-92) or David and Barton (1962, chap. 14). Since $N_r$ and $N_b$ are large, the fraction of sellers without a buyer is the same (to order $1/N_r$) each period, even though the fate of any particular seller remains uncertain.

9 Notice that eq. (21) is derived under the equilibrium condition that all buyer-seller pairs conclude their bargain successfully in the first period and exit the market.

The probability that a seller does not meet any young but meets at least one old customer is

$$v_{o2} = e^{-\theta_1}(1 - e^{-\theta_2}), \tag{24}$$

where $\theta_2$ is the number of old buyers in the market divided by $N_r$, and the number of old buyers in the market, which we denote by $N_{bo}$, equals the number of previously unsuccessful young:

$$N_{bo} = n_b - (1 - e^{-\theta_1})N_r. \tag{25}$$

Dividing (25) by $N_r$, we get

$$\theta_2 = \theta_1 - 1 + e^{-\theta_1}. \tag{26}$$

The steady-state flow condition in the two-generation regime is

$$n_r = N_r - N_r e^{-\theta_1} e^{-\theta_2}, \tag{27}$$

which says that the number of red sellers born equals the number who meet a suitable customer and exit the market (notice that $N_r e^{-\theta_1} e^{-\theta_2}$ is the number of unsuccessful sellers). Dividing equation (27) by $n_b$, substituting for $\theta_2$ from equation (26), and rearranging yields

$$\theta_1 = \theta_0 - \theta_0 e^{1 - 2\theta_1} \exp(-e^{-\theta_1}). \tag{28}$$

The existence of a steady state again requires $\theta_0 > 1$ ($n_b > n_r$).

Comparing equations (22) and (28), for given $\theta_0$, we find that $\theta_1 > \theta$ or, substituting this result in equations (20) and (23), $v_{y1} < v_{y2}$. For a seller, the probability of trading with a young buyer is higher in the two- than in the one-generation regime. Intuitively, at lower inflation rates, sellers are less “difficult” in their choice of bargaining partners, and therefore more sellers leave the market each period, leading to a higher ratio of young buyers to sellers.
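
The comparison of the two steady states can be checked numerically. The sketch below is illustrative only: it assumes the flow conditions read $\theta = (1 - e^{-\theta})\theta_0$ for the one-generation regime and $\theta_1 = \theta_0 - \theta_0 e^{1 - 2\theta_1}\exp(-e^{-\theta_1})$ for the two-generation regime, and it uses a plain bisection on the interval $(10^{-6}, \theta_0)$ to skip the trivial root at zero. The function names and the bracketing interval are our own choices, not part of the paper.

```python
import math


def bisect(f, lo, hi, tol=1e-12):
    """Plain bisection; f(lo) and f(hi) must have opposite signs."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def one_generation(theta0):
    """Return (theta, v_y1): nontrivial root of eq. (22) and eq. (20)."""
    theta = bisect(lambda th: th - theta0 * (1 - math.exp(-th)), 1e-6, theta0)
    return theta, 1 - math.exp(-theta)


def two_generation(theta0):
    """Return (theta1, theta2, v_y2, v_o2) from eqs. (28), (26), (23), (24)."""
    def residual(th1):
        return (th1 - theta0
                + theta0 * math.exp(1 - 2 * th1) * math.exp(-math.exp(-th1)))

    theta1 = bisect(residual, 1e-6, theta0)
    theta2 = theta1 - 1 + math.exp(-theta1)
    v_y2 = 1 - math.exp(-theta1)
    v_o2 = math.exp(-theta1) * (1 - math.exp(-theta2))
    return theta1, theta2, v_y2, v_o2


# The four theta_0 values simulated in the paper.
for theta0 in (1.01, 1.10, 2.00, 10.0):
    theta, v_y1 = one_generation(theta0)
    theta1, theta2, v_y2, v_o2 = two_generation(theta0)
    print(f"theta0={theta0:5.2f}  theta={theta:.4f}  theta1={theta1:.4f}  "
          f"v_y1={v_y1:.4f}  v_y2={v_y2:.4f}  v_o2={v_o2:.4f}")
```

For each value of $\theta_0$ the printed $\theta_1$ should exceed $\theta$, so that $v_{y2}$ exceeds $v_{y1}$, in line with the comparison in the text.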


V. The Rate of Price Inflation 

Each period, the bank in our economy creates inflation by printing new money or, more specifically, by multiplying by $\alpha_0$ the money holdings of the blue consumers who are going to the red market.

The rate of price increase cannot in general be $\alpha_0$ since the latter is the factor by which only a fraction of the money in the economy is multiplied. How large this fraction is, and therefore how large an effect the bank exerts on the price level, clearly depends on the velocity at which the consumers, and the money, circulate in the two markets. The implication is that the relationship between $\alpha_0$ and the inflation rate will be different in the different regimes.
The nominal price of the blue good is derived by the condition stating that the total revenues from its sale must equal the total amount of money spent in the blue market.

Consider, first, the case in which blue buyers are allowed only one period of search: the one-generation regime. In period $t + 2$, successful blue consumers born in period $t$ return to the blue market, along with the red sellers with whom they have traded; net money holdings of these two groups are $(\alpha_0 p_{bt})$(number of successful blue). Recall that $v_{y1}$ is the fraction of red sellers who successfully match with blue buyers and that $\theta$ is the ratio of blue born (and searching) to red. Then the number of successful blue is just $(v_{y1}/\theta)n_b$. In addition to these two groups, unsuccessful blue agents born in period $t$ also return to the blue market at time $t + 2$; their net money holdings are $(\alpha_0 p_{bt})$(number of unsuccessful blue), which equals $(\alpha_0 p_{bt})[1 - (v_{y1}/\theta)]n_b$. No other buyers shop in the blue market in period $t + 2$; hence the equilibrium condition equating nominal supply and demand is


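$$n_b p_{bt+2} = \alpha_0 p_{bt}\left(\frac{v_{y1}}{\theta}\right) n_b + \alpha_0 p_{bt}\left[1 - \left(\frac{v_{y1}}{\theta}\right)\right] n_b.$$

Simplifying, we get $p_{bt+2} = \alpha_0 p_{bt}$ or

$$\alpha^2 = \alpha_0. \tag{29}$$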

Equation (29) could have been derived directly by noticing that in this regime, for each dollar that leaves the blue market at $t$, $\alpha_0$ must come back at $t + 2$, regardless of how many transactions have been successful in the red market.

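In the two-generation regime, the procedure is identical. The buyers of the blue good at time $t + 2$ are the blue consumers who, born at $t$, have been successful in their first search, and their red partners; the blue consumers who, born at $t - 1$, have been successful in their second search, and their red partners; and the unsuccessful blue agents born at $t - 1$. The sum of their money holdings is given by

$$\alpha_0 p_{bt}\left(\frac{v_{y2}}{\theta_1}\right) n_b + \alpha_0 p_{bt-1}\left(\frac{v_{o2}}{\theta_1}\right) n_b + \alpha_0 p_{bt-1}\left(1 - \frac{v_{y2}}{\theta_1} - \frac{v_{o2}}{\theta_1}\right) n_b,$$

which leads to the equilibrium condition

$$p_{bt+2} = \alpha_0\left[p_{bt}\,\frac{v_{y2}}{\theta_1} + p_{bt-1}\left(1 - \frac{v_{y2}}{\theta_1}\right)\right]$$

or

$$\alpha^3 = \alpha_0\left[1 + (\alpha - 1)\frac{v_{y2}}{\theta_1}\right]. \tag{30}$$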
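
The mapping from the bank's factor $\alpha_0$ to the per-period inflation factor $\alpha$ can be illustrated with a short sketch. It rests on two assumptions that should be read as ours: in the one-generation regime $\alpha^2 = \alpha_0$, and in the two-generation regime the money-market accounting described above (a share $q = v_{y2}/\theta_1$ of each cohort returns after one search, the rest one period later) reduces to $\alpha^3 = \alpha_0[1 + (\alpha - 1)q]$. The share `q` is passed in as a hypothetical number rather than solved from the matching model.

```python
import math


def alpha_one_generation(alpha0):
    """One-generation regime: alpha**2 = alpha0."""
    return math.sqrt(alpha0)


def alpha_two_generation(alpha0, q, tol=1e-12):
    """Solve alpha**3 = alpha0 * (1 + (alpha - 1) * q) on [1, alpha0].

    q stands for v_y2 / theta_1, the share of a cohort that trades in its
    first search. It is an assumed input in (0, 1], not a model solution.
    """
    if alpha0 < 1 or not 0 < q <= 1:
        raise ValueError("expected alpha0 >= 1 and 0 < q <= 1")

    def residual(a):
        return a ** 3 - alpha0 * (1 + (a - 1) * q)

    # residual(1) <= 0 <= residual(alpha0), and the residual is convex,
    # so there is a single crossing on [1, alpha0].
    lo, hi = 1.0, alpha0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


alpha0 = 4.0  # hypothetical bank factor
print("one-generation alpha:", alpha_one_generation(alpha0))
for q in (0.25, 0.5, 1.0):
    print(f"two-generation alpha (q={q}):", alpha_two_generation(alpha0, q))
```

Under these assumptions the two-generation $\alpha$ is never above the one-generation value and coincides with it only when every buyer trades in the first search ($q = 1$), which is the velocity effect described in the text.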
We conclude this discussion with two observations. First, nominal prices in the red market also rise at the rate $\alpha$, as can be seen by comparing buyers in periods $t$ and $t + 1$: young buyers in $t + 1$ hold $\alpha$ times as much money as young in $t$, second-generation buyers in $t + 1$ hold $\alpha$ times what second-generation buyers hold in $t$, and so forth.

Second, as inflation rises and search is reduced, the velocity of circulation of money increases since blue buyers wait less time before spending their cash balances. It can be shown that a given percentage increase in the money holdings of the blue consumers ($\alpha_0$) has a larger effect on inflation in the one-generation regime than it does in the two-generation case. (As will become clear below, this property holds generally, for any $(k - 1)$- and $k$-generation regimes.) Note that this result, one of the classical assumptions of all hyperinflation models, is here derived endogenously, as transactions respond to inflation. Of course, within each regime, the velocity of circulation of money is constant, as is the amount of real money in the economy.


VI. Transition from the Two-Generation to the 
One-Generation Regime 

For given $\delta$ and $\theta_0$, the levels of inflation at which the two-generation regime ends and the one-generation regime begins are given by equations (11) and (19):

$$\pi \le \frac{1 - \delta}{\delta v_{y2}}, \qquad \pi \ge \frac{1 - \delta}{\delta v_{y1}}.$$

Recalling that $\alpha \equiv 1 + \pi$, we let $\alpha^{U2}$ and $\alpha^{L1}$ denote the points at which these two inequalities become equalities; thus $\alpha^{U2}$ is the upper edge of the two-generation case and $\alpha^{L1}$ the lower edge of the one-generation case. Solution of the matching process has demonstrated that $v_{y2}$ exceeds $v_{y1}$, which implies that $\alpha^{U2}$ is less than $\alpha^{L1}$. The situation is further complicated, however, by the fact that $\alpha$ depends on $\alpha_0$ differently in the two regimes. Letting $\alpha_1$ and $\alpha_2$ represent $\alpha$ in the one- and two-generation regimes, we have seen that $\alpha_1$ exceeds $\alpha_2$ for any given $\alpha_0$ and that $\alpha_2$ depends on $\theta_0$, whereas $\alpha_1$ does not.
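
As a small numerical illustration of the two regime edges, the sketch below assumes that conditions (11) and (19) read $\pi \le (1 - \delta)/(\delta v_{y2})$ and $\pi \ge (1 - \delta)/(\delta v_{y1})$. The matching probabilities passed in are hypothetical values chosen only so that $v_{y2} > v_{y1}$; they are not taken from the paper's simulations.

```python
def regime_edges(delta, v_y1, v_y2):
    """Return (alpha_U2, alpha_L1), the inflation factors at the regime edges.

    alpha_U2: upper edge of the two-generation regime (condition (11) binding).
    alpha_L1: lower edge of the one-generation regime (condition (19) binding).
    """
    pi_u2 = (1 - delta) / (delta * v_y2)
    pi_l1 = (1 - delta) / (delta * v_y1)
    return 1 + pi_u2, 1 + pi_l1


# Hypothetical inputs: discount factor and matching probabilities.
alpha_u2, alpha_l1 = regime_edges(delta=0.95, v_y1=0.80, v_y2=0.85)
print(f"alpha_U2 = {alpha_u2:.4f}, alpha_L1 = {alpha_l1:.4f}")
```

Because $v_{y2}$ exceeds $v_{y1}$, the first edge always lies below the second, leaving a gap in $\alpha$ between the two pure-strategy regimes.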

Figure 1 depicts the relationship between $\alpha_0$ and $\alpha$ across the two regimes. When $\theta_0$ is near one, $v_{y2}$ is significantly above $v_{y1}$; as a result, the “probability effect” dominates, and the $\alpha_0$ at which $\alpha^{U2}$ is realized does fall short of the $\alpha_0$ at which $\alpha^{L1}$ is realized. When $\theta_0$ is large, however, the seller’s chances of meeting a young buyer are high and do not diminish much between the two regimes; hence $v_{y1}$ and $v_{y2}$ are similar, and the “velocity effect” that relates $\alpha$ to $\alpha_0$ dominates, pushing the $\alpha_0$ corresponding to $\alpha^{U2}$ above the $\alpha_0$ corresponding to $\alpha^{L1}$.

For low $\theta_0$, a “mixed-strategy” equilibrium will emerge for $\alpha_0$ between point A and point B. In this region a fraction $\beta$ of sellers talk to the young buyers only, and a fraction $1 - \beta$ to both generations. When $\alpha_0$ is at the low end of this region, $\beta$ is near zero, and it increases steadily to one at the upper edge. For each $\alpha_0$, $\beta$ is such that expected utilities to the two types of sellers are equal.$^{10}$

Fig. 1.—Transition between the two- and one-generation regimes

For high $\theta_0$, a “multiple-equilibria” region emerges. For $\alpha_0$ between point B and point C, there are two possibilities: either all sellers talk to both generations of buyers or all sellers talk to only one generation. Both possibilities are equilibria. If all other sellers talk to both generations, $\alpha$ becomes $\alpha_2$, which is less than $\alpha^{U2}$; hence it is a dominant strategy for a particular seller to talk to both generations as well. If all others talk only to the young, then $\alpha$ becomes $\alpha_1$, which exceeds $\alpha^{L1}$, so that a dominant strategy is to talk to only the young.

Presumably some $\theta^*$ exists at which $\alpha^{U2}$ equals $\alpha^{L1}$. We have not sought to determine this $\theta^*$, but instead have simulated four different values of $\theta_0$: 1.01, 1.10, 2.00, and 10.0. Of these, 1.01 and 1.10 possess mixed-strategy regions, and 2.00 and 10.0 multiple equilibria.

10 For each $\alpha_0$ within the mixed-strategy regime, the fraction $\beta$ of sellers who talk to both generations is found as the fixed point of the following mappings: each $\beta$ maps into a seller's probabilities of meeting the young or old buyers; in turn these probabilities map into $\alpha$, prices, and expected utilities. If the expected utility of those sellers who talk only to young is, e.g., above the expected utility of those who talk to both generations, $\beta$ is lowered, but this lowers the probability of meeting a young customer while it raises the probability of meeting an old, causing a readjustment in expected utilities.


VII. Expected Utility in the One- and
Two-Generation Regimes

We complete our analysis of the one- and two-generation regimes by 
calculating the expected utility of the red and blue consumers. 

Let us first examine the expected utility at birth of a red seller. Within the two-generation regime, it can be seen directly that it increases with $\alpha$:

$$EU_{r2} = \frac{\delta}{\alpha}\left\{(p_{y2}v_{y2} + p_{o2}v_{o2})\left[1 + \delta(1 - v_{y2} - v_{o2}) + \delta^2(1 - v_{y2} - v_{o2})^2 + \dots\right]\right\}.$$

The red seller will trade immediately if either a young or an old buyer enters his store, and it will then take him one period to convert his earnings into units of blue good. If he does not trade in the first period (with probability $1 - v_{y2} - v_{o2}$) he will have to wait longer; longer still if he is again unlucky in the second period (with probability $(1 - v_{y2} - v_{o2})^2$), and so forth. The expected utility converges to

$$EU_{r2} = \frac{(\delta/\alpha)(p_{y2}v_{y2} + p_{o2}v_{o2})}{1 - \delta(1 - v_{y2} - v_{o2})}, \tag{31}$$

which increases in the bargaining prices.
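
The step from the infinite series to the closed form is a geometric sum with ratio $\delta(1 - v_{y2} - v_{o2})$. A minimal numerical check, using hypothetical prices and probabilities that are not taken from the paper, might look as follows.

```python
def eu_red_closed_form(delta, alpha, p_y2, v_y2, p_o2, v_o2):
    """Closed-form expected utility of a red seller (two-generation regime)."""
    payoff = p_y2 * v_y2 + p_o2 * v_o2
    return (delta / alpha) * payoff / (1 - delta * (1 - v_y2 - v_o2))


def eu_red_series(delta, alpha, p_y2, v_y2, p_o2, v_o2, n_terms=200):
    """Same quantity from the truncated series: one term per period of waiting."""
    payoff = p_y2 * v_y2 + p_o2 * v_o2
    no_trade = 1 - v_y2 - v_o2  # probability of meeting nobody in a period
    total = sum((delta * no_trade) ** n for n in range(n_terms))
    return (delta / alpha) * payoff * total


# Hypothetical parameter values, for illustration only.
params = dict(delta=0.95, alpha=1.5, p_y2=0.6, v_y2=0.5, p_o2=0.4, v_o2=0.2)
print(eu_red_closed_form(**params), eu_red_series(**params))
```

The two numbers agree to machine precision once enough terms are included, since the common ratio is strictly below one whenever the seller meets someone with positive probability.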

Similarly, in the mixed-strategy (if one exists) and one-generation regime, the red expected utility equals

$$EU_{r1} = \frac{(\delta/\alpha)\,v_{y1}\,(\alpha_0/\alpha)}{1 - \delta(1 - v_{y1})}, \tag{32}$$

where it should be recalled that $p_{y1}$ equals $I_y$. In the one-generation regime, $EU_{r1}$ is constant and simplifies to $\delta v_{y1}/[1 - \delta(1 - v_{y1})]$ since $\alpha_0$ equals $\alpha^2$.

When the two regimes are compared, the two factors working in opposite directions are easily identified. On one hand, $p_y$ is higher in the one-generation regime, and this tends to increase $EU_{r1}$ relative to $EU_{r2}$. On the other, the probability of successful trade in any period is lower since old buyers are disregarded and the probability of meeting a young blue customer is reduced. This tends to reduce $EU_{r1}$.

To explore the question of red utility further, we have simulated the economy at the $\theta_0$ values of 1.01, 1.10, 2.00, and 10.0, as mentioned above.$^{11}$ Figure 2 illustrates our findings for the representative case $\theta_0$ equal to 1.10 and the three $\delta$ values of .90, .95, and .99. In all three cases, the red seller’s expected utility is always lower in the one-generation regime. Connecting the two regimes is a region of mixed strategy, in which the utility declines as the probability $v_{y1}$ falls.

The result was confirmed in the simulations with the other three $\theta_0$ values; since they cover a very wide range, we would suggest that the result is general. For the lower $\theta_0$ values, in which the two regimes are connected by a region of mixed strategy, we conclude that, within this region, red expected utility falls with $\alpha_0$. For higher $\theta_0$, the two regimes overlap in a region of multiple equilibria, and utility depends on which equilibrium emerges. The two-generation equilibrium dominates.

The expected utility of the blue consumers reflects the same ambiguity. The intuition is simple: as the ratio of young buyers to sellers in the market decreases, moving from the two-generation to the one-generation regime, the probability of purchasing the red good during the first search period rises. This positive effect is, of course, the mirror image of the lower probability of successful trade for the red seller. However, offsetting this is the fact that the real price of the red good becomes higher with inflation.

For a blue buyer, the probability of being chosen as a bargaining partner is easily derived: it must equal the number of successful buyers of one’s generation (identical to the number of red sellers meeting at least one customer of the given age), divided by the overall number of buyers of that generation in the market. That is, the probability of being chosen while belonging to the $i$th generation in the $k$-generation regime is $v_{ik}/\theta_i$. We can immediately verify that while $v_{y1} < v_{y2}$, $v_{y1}/\theta > v_{y2}/\theta_1$.

Fig. 2.—Expected utilities across the one- and two-generation regimes

11 To provide a reference framework, we can calculate the utility of the agents in the case of zero inflation. The first observation is that, in this case, both buyers and sellers leave the market only after a successful transaction, which implies the steady-state conditions $n_b/n_r = 1$ and $N_b/N_r = 1$. For a seller, the probability of having at least one buyer in the first period is $(1 - e^{-1})$, of having at least one buyer in the second period and none in the first is $e^{-1}(1 - e^{-1})$, etc. In addition, all buyers in the market are identical, and when two bargaining partners meet, they share the real money holdings of the buyer according to the partition $[1/(1 + \delta), \delta/(1 + \delta)]$ (the Rubinstein solution). It is then straightforward to derive

$$EU_r = \frac{\delta(1 - e^{-1})}{(1 + \delta)(1 - \delta e^{-1})} = \begin{cases} .448 & \text{if } \delta = .9 \\ .473 & \text{if } \delta = .95 \\ .495 & \text{if } \delta = .99, \end{cases}$$

$$EU_b = \frac{\delta^3(1 - e^{-1})}{(1 + \delta)(1 - \delta e^{-1})} = \begin{cases} .363 & \text{if } \delta = .9 \\ .427 & \text{if } \delta = .95 \\ .485 & \text{if } \delta = .99. \end{cases}$$

At zero inflation the differences in the utilities of buyers and sellers simply reflect the imposed, and uninteresting, differences in the order and timing of their moves, and disappear as the discount rate goes to one ($EU_r = EU_b \to .5$ as $\delta \to 1$). The extra period required by blue consumers to complete their life cycle and the seller's advantage of quoting the first price in the bargaining game account for the $\delta^2$ term. This result requires $\theta_0 = 1$ and therefore is not immediately comparable to the values obtained in the simulations (since the presence of inflation requires a higher $\theta_0$). In general, increasing $\theta_0$ favors the sellers and hurts the buyers, by its effect on matching probabilities.
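
The zero-inflation benchmark values reported in the footnote can be reproduced with a few lines. The seller's formula $\delta(1 - e^{-1})/[(1 + \delta)(1 - \delta e^{-1})]$ follows from the footnote's description; the buyer's formula, the same expression multiplied by $\delta^2$, is our inference from the reported numbers and from the remark about the $\delta^2$ term, and should be read as an assumption.

```python
import math


def zero_inflation_utilities(delta):
    """Return (EU_r, EU_b) at zero inflation with theta_0 = 1."""
    meet = 1 - math.exp(-1)                        # P(at least one buyer arrives)
    denom = (1 + delta) * (1 - delta * math.exp(-1))
    eu_red = delta * meet / denom                  # seller share 1/(1+delta), one period to convert
    eu_blue = delta ** 2 * eu_red                  # assumed: buyer lags the seller by a delta**2 factor
    return eu_red, eu_blue


for delta in (0.90, 0.95, 0.99):
    eu_red, eu_blue = zero_inflation_utilities(delta)
    print(f"delta={delta:.2f}  EU_r={eu_red:.3f}  EU_b={eu_blue:.3f}")
```

Both utilities approach one-half as $\delta$ goes to one, which matches the limiting statement in the footnote.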

Substituting these probabilities in the blue utility function, we get 
EV„ - (l - - /«. < 3 «

Let us call $A^*$ the minimum value of $A$ that supports the described pure strategy equilibria for the two- and one-generation cases, where a blue agent is allowed the alternative to consume only on the blue market, immediately and without going to the bank ($U = 1 - A$). Simulation results (for $\theta_0 = 1.10$) indicate that with $A > A^*$ and any $\delta$, the expected utility of the blue consumers falls across regimes as inflation increases if $A$ is sufficiently large (i.e., $A > 2$). In addition, the higher is $\delta$, the closer the critical $A$ is to $A^*$, converging to $A^*$ for $\delta = .99$, the most relevant case. In other words, if the punishment from not purchasing the red good is high enough, the probability effect does not compensate the blue buyers for the loss in bargaining power deriving from the higher inflation rate. The final result is an overall welfare loss.


VIII. Solution of the General Case 

While we have devoted most of our attention to a detailed exposition of the one- and two-generation regimes, our analysis extends to the general case of $k$ generations of buyers. Presumably as $\alpha_0$ falls toward one, blue buyers are willing to search more periods in the red market. We would expect the price paid by the youngest buyers in the market, $p_{1k}$, to fall as $\alpha_0$ falls within a regime and also to fall as $k$ rises. In fact, in the limit as $\alpha_0$ approaches one and $k$ becomes very large, $p_{1k}$ should approach the Rubinstein solution. More uncertainty attaches to the behavior of expected utility. For large enough $A$, we expect blue buyers to become better off as $\alpha_0$ falls and $k$ rises, but the fate of the red sellers is unclear: lower prices as $\alpha_0$ falls will diminish their utility within a regime, but higher probabilities of meeting eligible buyers may increase their utility across regimes, as was found to be true in a comparison of the one- and two-generation regimes.

To derive the equilibrium bargaining prices, we carry over the earlier analysis, with one important modification. In the one- and two-generation cases, a seller who fails to reach agreement with a buyer by the third period will prefer to wait rather than continue talking to that buyer. By contrast, in the $k$-generation case, a seller who has matched with a buyer of generation $k - 2$ or less will be willing to continue talking to that buyer in period 3 since the buyer is no older than $k$ generations. This affects the original bargain because the seller’s expected third-period price is changed to reflect the fact that, off equilibrium, he will always have someone to talk to in period 3.

The equilibrium prices solve $k$ simultaneous equations, each obtained from a bargaining analysis that matches a representative seller against a buyer of a particular age. Let us define as $x_{ik}$ the expected price to the seller in the third period, when he has matched with an $i$th-generation buyer. For $i$ equal to $k$ or $k - 1$,

$$x_{ik} = \frac{\sum_{j=1}^{k} p_{jk} v_{jk}}{1 - \delta v_{nk}}, \tag{35}$$

where $v_{nk} \equiv 1 - v_{1k} - \dots - v_{kk}$. For $i$ less than $k - 1$,

$$x_{ik} = \sum_{j=1}^{i+1} p_{jk} v_{jk} + \left(1 - \sum_{j=1}^{i+1} v_{jk}\right) p_{(i+2)k}, \tag{36}$$

where the last term reflects the fact that the seller is always free to rematch with his current partner in period 3 if no one younger arrives. The standard analysis then yields

$$p_{ik} = \min\left\{ I_i,\; I_i\left(1 - \frac{\delta}{\alpha}\right) + \delta^2 x_{ik} \right\}, \qquad i = 1, \dots, k, \tag{37}$$

which is a linear simultaneous system in the $p_{ik}$'s.

The Ath regime is an equilibrium for all a 0 and a* such that (i) at the 
lower boundary p kk a /* and (ii) at the upper boundary s / ( *_ 1)t 

a direct analogue of the equilibrium conditions imposed earlier. 

We have not been able to characterize the solution to (37) as completely as its specific one- and two-generation versions. We suspect, however, that within a regime all prices are nondecreasing in $\alpha$ (and $\alpha_0$) and that the younger a buyer is when he matches, the higher his utility: the “returns to search” result.

The matching probabilities can be readily derived by extending the logic applied to the one- and two-generation regimes. Continue to let $N_r$ denote the number of active sellers and $\theta_1$ the ratio of young buyers to $N_r$ ($\theta_1 = n_b/N_r$). It is not hard to show that $\theta_j$, the ratio of buyers of the $j$th generation to $N_r$, and $v_{jk}$, the probability that a seller talks to a $j$th-generation buyer, satisfy the following recursive relationships:

$$\theta_{j+1} = \theta_j - v_{jk}, \tag{38}$$

$$v_{(j+1)k} = \frac{v_{jk}\, e^{-\theta_j}(1 - e^{-\theta_{j+1}})}{1 - e^{-\theta_j}}, \tag{39}$$

and that the steady-state flow condition equating entry and exit of red sellers is given by

$$n_r = N_r - N_r \exp\left(-\sum_{j=1}^{k} \theta_j\right) \tag{40}$$

or, if we divide by $n_b$ and rearrange,

$$\theta_1 = \theta_0 - \theta_0 \exp\left(-\sum_{j=1}^{k} \theta_j\right). \tag{41}$$

To solve for the $\theta$'s and $v$'s, we can substitute from (38) and (39) into (41).
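
A sketch of how the general matching system might be solved numerically is given below. It assumes the recursions take the form $\theta_{j+1} = \theta_j - v_{jk}$ and $v_{jk} = \exp(-\sum_{i<j}\theta_i)(1 - e^{-\theta_j})$ (a seller talks to generation $j$ when nobody younger arrives and at least one buyer of generation $j$ does), and that the flow condition is $\theta_1 = \theta_0[1 - \exp(-\sum_j \theta_j)]$. The function names and the bisection bracket are our own; this is not the authors' simulation code.

```python
import math


def generation_profile(theta1, k):
    """Return ([theta_1..theta_k], [v_1k..v_kk]) implied by a given theta_1."""
    thetas, vs = [], []
    theta, no_younger = theta1, 1.0   # no_younger: P(no buyer of a younger generation arrives)
    for _ in range(k):
        v = no_younger * (1 - math.exp(-theta))
        thetas.append(theta)
        vs.append(v)
        no_younger *= math.exp(-theta)
        theta -= v                    # next generation = this one minus its successful buyers
    return thetas, vs


def solve_regime(theta0, k, tol=1e-12):
    """Solve the steady-state flow condition for theta_1 by bisection."""
    def residual(th1):
        thetas, _ = generation_profile(th1, k)
        return th1 - theta0 * (1 - math.exp(-sum(thetas)))

    lo, hi = 1e-6, theta0             # skip the trivial root at zero
    if residual(lo) >= 0 or residual(hi) <= 0:
        raise ValueError("no sign change; theta0 must exceed one")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0:
            hi = mid
        else:
            lo = mid
    return generation_profile(0.5 * (lo + hi), k)


for k in (1, 2, 3, 5):
    thetas, vs = solve_regime(theta0=1.10, k=k)
    print(f"k={k}  theta_1={thetas[0]:.4f}  P(seller trades)={sum(vs):.4f}")
```

With $k = 1$ and $k = 2$ the routine reduces to the one- and two-generation steady states derived earlier, which gives a simple consistency check on the recursion.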

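By extending previous results, we can also derive the relationship between $\alpha_k$ (inflation in the $k$th regime) and $\alpha_0$. Aggregate money holdings in the blue market in period $t + k$ are

$$\alpha_0 p_{bt+k-2}\left(\frac{v_{1k}}{\theta_1}\right) n_b + \alpha_0 p_{bt+k-3}\left(\frac{v_{2k}}{\theta_1}\right) n_b + \dots + \alpha_0 p_{bt-1}\left(\frac{v_{kk}}{\theta_1}\right) n_b + \alpha_0 p_{bt-1}\left(1 - \sum_{i=1}^{k} \frac{v_{ik}}{\theta_1}\right) n_b,$$

where $v_{ik}$ is the fraction of red sellers who, in the $k$-generation regime, successfully match with blue buyers of the $i$th generation. Equating nominal supply of and demand for the blue good, we obtain

$$\alpha_k^{k+1} = \alpha_0\left[\sum_{i=1}^{k} \alpha_k^{k-i}\,\frac{v_{ik}}{\theta_1} + 1 - \sum_{i=1}^{k} \frac{v_{ik}}{\theta_1}\right]. \tag{42}$$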

The generalized versions of the expected utilities follow directly from the previous section:

$$EU_{rk} = \frac{(\delta/\alpha)\sum_{i=1}^{k} p_{ik} v_{ik}}{1 - \delta\left(1 - \sum_{i=1}^{k} v_{ik}\right)}, \tag{43}$$

\ I a 1 




(44) 


We have simulated our economy with the four $\theta_0$ values 1.01, 1.10, 2.00, and 10.0 discussed earlier to cover lower $\alpha_0$ and higher $k$ values. One important result has emerged: the red seller’s expected utility has an interior maximum at a value of $k$ between two and eight (depending on $\theta_0$, and higher for lower $\theta_0$). That is, there is a range of inflation rates that maximizes the red seller’s welfare (see fig. 3).


IX. Conclusions 

This paper examines distributional effects and welfare costs of inflation by focusing on the organization of exchange. More specifically, we concentrate on the different bargaining powers associated with different roles in transactions, and by emphasizing the importance of time, we allow the trading technology to respond to the increase in nominal values.

The framework lends itself to a number of worthwhile extensions. 

First of all, the model could be used to study the effect of inflation on the adoption of different trading technologies. Suppose that agents were born with an endowment of labor that could be employed in producing either a blue or a red good. As inflation increases, entry in the decentralized sector would rise, and one-to-one bargaining would become more widespread. However, this would negatively affect the probabilities of sale. The final equilibrium, if one exists, would presumably lead to the conclusion common to much anecdotal evidence that inflation reduces efficiency by decentralizing exchanges.

A second interesting question concerns the optimal switch to a constant-value currency unit. Even if strictly defined barter were excluded (e.g., by variety in production and specific preferences in consumption), still at high inflation there will be strong incentives toward adopting a stable exchange unit: foreign exchange, if available, gold, or a specific good. Suppose that each agent had to sell his endowment through a bargaining game and then to purchase his consumption again through bargaining. Then inflation would play in his favor in the first stage but against him in the second. Whether, and when, the economy would adopt a stable means of exchange would depend on the relative force of these two opposite effects.

In both these extensions, the government should be allowed a more active role than the one played in this paper.
The Incidence of Sanctions against Employers
of Illegal Aliens


John K. Hill

This article assesses the significance of sanctions against employers of illegal aliens for resource allocation and income distribution in the United States. Data from the 1980 Census of Population are used to identify the industries likely to be monitored most closely by the immigration authorities. A general equilibrium incidence analysis then is carried out using alternative assumptions about the overall level of enforcement. Estimates are made of the effects sanctions will have on the real wages of legal U.S. workers.


I. Introduction 

The Immigration Reform and Control Act of 1986 marked a new phase in policies to control illegal immigration into the United States.$^{1}$ Previous efforts had relied on border patrol and deportation of captured illegal aliens. The new law added sanctions against employers of illegal immigrants to the means already available to authorities. Prior to the 1986 law, employers who knowingly hired illegal aliens faced no risk of punishment.

We wish to thank Jeff Gunther for research assistance and two anonymous referees for helpful comments. Much of the research on this article was conducted while Pearce was at the Federal Reserve Bank of Dallas. Any views expressed in the article are solely
Dallas, the Federal Reserve System, or the Federal Home Loan Bank of Atlanta.

1 For a summary of the new immigration law, see U.S. Congress (1986).

Early evidence indicated that the new law was having significant effects. Border crossings declined, and there were numerous reports of employers firing undocumented workers (Recio et al. 1987). The efficacy of sanctions is likely to change over time, however.$^{2}$ Employers will adjust their behavior as they become familiar with the law and the pattern of enforcement. Millions of business establishments are covered by the law, but the enforcement budget provides for only a few thousand agents. The authorities will have no choice but to enforce sanctions selectively.

In this article we assess the long-run significance of sanctions for 
resource allocation and income distribution in the United States. We 
identify the industries likely to be most affected by sanctions and 
provide a sense of the magnitude of their possible effects on wages of 
legal U.S. workers. A key feature of the analysis is the differential 
nature of optimal enforcement. 

In studying the effects of sanctions, we confront a number of difficult choices regarding basic issues in enforcement. Our general philosophy is to impose on the authorities only one significant handicap: a limited enforcement budget. Other assumptions are biased toward making sanctions more successful in achieving their intended objectives. For example, we ignore the potential for counterfeiting documents. We also ignore the possibility that the authorities will be less selective and efficient in their monitoring to avoid inducing racial discrimination by employers.

Our basic approach is to view sanctions as a tax on the use of illegal labor by employers targeted for inspection by the authorities. The penalties are levied on a per worker basis. Thus if employers know the probabilities of detection and are risk neutral, they will respond to sanctions as they would to a tax levied at a rate equal to the fine times the probability of detection. Much of our analysis follows the standard theory of factor tax incidence. There is one additional layer of complication, however. It is not obvious which industries face high tax rates and which industries face low rates. The law does not specify a pattern of enforcement. The immigration authorities decide which industries to monitor and what intensity to use.


2 There is evidence that illegal aliens have already begun to revise their beliefs regarding the effectiveness of the new law. Field observations at Canon Zapata, the busiest illegal crossing point along the U.S.-Mexican border, indicate that the flow of illegal immigrants rose 15 percent during the first 6 months of 1988 to a level approaching that in November 1986, when the law went into effect. Data on border apprehensions by immigration officials show a similar movement. See the report by Rohter (1988).
We assume that the goal of enforcement is to minimize the number of illegal aliens working in the U.S. economy. In this framework, a principal determinant of the intensity with which an industry is monitored is the concentration of illegal workers at an individual business establishment. Data from the 1980 Census of Population are used to estimate, for all industries thought to employ large numbers of illegal workers, the average number of illegal aliens working at an establishment shortly before the passage of the new law. This information is used to rank-order the industries in terms of the intensity with which they are optimally monitored.

Having determined which industries make the best targets for inspection, we estimate the economic effects of sanctions using alternative assumptions about the size of the enforcement budget. In each enforcement regime, the economy is partitioned into two sectors: one consisting of industries to be heavily monitored and another composed of industries to be lightly monitored. The effective tax rates in the two sectors are chosen to conform to the rules for optimal enforcement. Incidence calculations are made using a general equilibrium model commonly used in studies of partial factor taxes. The analysis provides estimates of the effects of sanctions on production, the real wages of low- and high-skill labor, and the size of the illegal work force.

II. Industry Enforcement Patterns 

Three assumptions are central to our analysis of enforcement: (1) that the budget is inadequate for achieving complete compliance, (2) that enforcement patterns are sufficiently predictable for employers to know their chances of being inspected, and (3) that the goal of enforcement is to minimize the number of illegal aliens working in the United States.

These assumptions are conservative and straightforward. Enforcement
resources are limited in the United States for laws of all types. In
other countries with sanctions, large numbers of illegal aliens remain
because enforcement budgets are inadequate for obtaining widespread
compliance. 3 The assumption of a predictable enforcement
strategy is consistent with existing practices. Law enforcement
agencies, from the Internal Revenue Service to the local police, display
a strong tendency toward predictable emphasis on specific targets.
Plans of the Immigration and Naturalization Service (INS) suggest

3 See U.S. General Accounting Office (1986) for commentaries on the effectiveness
of laws that govern the employment of alien workers in Western Europe, Canada, and
Hong Kong.
that enforcement of sanctions will be selective and predictable. 4 Our
formulation of the goal of enforcement is the one that best accords
with congressional intent. It is also consistent with criteria to be used
by the General Accounting Office (GAO) in its evaluation of
sanctions. 5

A. The Enforcement Model 

We now develop an enforcement model that contains the essential
features of optimal constrained enforcement but is simple enough to
be integrated with the general equilibrium models used in tax
incidence studies. Business establishments are assumed to be identical
within industries. Let $\pi_j$ denote the fraction of all establishments
in industry $j$ to be inspected. Thus $\pi_j$ represents the probability
of inspection for any individual employer. Employers know the $\pi_j$
and take them as given. Employers also know when they are hiring an
illegal alien and are risk neutral in this decision. The full cost of
employing an illegal alien is simply $(w + \pi_j f)$, where $w$ is the
wage paid to illegal workers and $f$ is the fine per detected violation.
Sanctions are seen to operate as taxes on illegal labor, at rates that
vary across industries with differences in auditing frequencies.

National demand for illegal labor is represented by $\sum_j a_j E_j$,
where $a_j$ is the number of illegal workers per establishment and $E_j$
is the number of establishments in industry $j$. The enforcement agency
chooses the $\pi_j$ with the objective of minimizing $\sum_j a_j E_j$.
Establishment inspections are subject to a budget constraint. We assume
that all inspections consume the same quantity of resources and that a
maximum of $R$ inspections are possible under the given budget. The
agency's constraint can then be written as $\sum_j \pi_j E_j = R$.

To carry the analysis further, we need a production theory that can
be used to relate $a_j$ and $E_j$ to $\pi_j$. To this end, assume that
production functions are linear homogeneous and that the ratio of
establishments to industry output is fixed for each industry. The demand
for illegal labor then has a simple structure. The term $a_j$ is
proportional to the quantity of illegal workers that minimizes unit cost,
a function that conveniently summarizes the technical substitution
possible between

4 Plans for fiscal year 1988 provided for the following allocation of INS enforcement
resources. Sixty percent of INS staff time was to be devoted to investigations of
suspected violators. The remaining 40 percent was to be used in reviewing I-9 forms, with
half of the reviews to be done in industries known to employ illegal aliens and the other
half to be done at random (see U.S. General Accounting Office 1987, pp. 27-28).

5 Before offering a favorable evaluation of the sanctions program, the GAO expects
to see a decline in INS border and workplace apprehensions. The GAO also would like
to compare estimates of the size of the illegal alien population made before and after
the introduction of sanctions (see U.S. General Accounting Office 1987, pp. 41-47).
illegal labor and other factors. The term $E_j$ is proportional to the
level of industry output.

A final issue to be addressed concerns the amount of information
available to the authorities when they decide on an allocation of
inspection resources. We assume that the authorities know the current
values of the $a_j$ and have a general understanding of the reductions in
employment of illegal workers made possible through technical factor
substitution (specifically, they know the economywide compensated
elasticity of demand for illegal labor). The authorities do not
anticipate induced changes in industry outputs, the illegal wage, or any
other factor price. In an enforcement equilibrium, however, inspection
patterns must be optimal when all variables assume market-clearing
values.

The enforcement model now has been defined. The characteristics
of an optimal solution are easy to derive and can be given without
proof. Under a broad set of possible budgets, it is optimal to monitor
only a subset of industries. Whether or not a given industry is
monitored depends on the number of illegal workers that would be
employed in an average establishment if there were no threat of audit.
Let $a_j^*$ denote this value and let $m$ represent the index of the
marginal industry. Then it is optimal to monitor industry $j$ if and only
if $a_j^*$ exceeds $a_m^*$. The value of $a_m^*$ is determined by the size
of the enforcement budget. The larger is the budget, the lower is $a_m^*$
and the larger is the set of industries to be monitored.

Within the set of monitored industries, auditing frequencies are
chosen to equalize the reductions in employment of illegal workers
expected to result from an additional inspection. Given our
assumptions about which economic adjustments the authorities anticipate,
this condition is equivalent to

$$1 + t_j = \frac{a_j}{a_m}, \qquad (1)$$

where $t_j$ is the effective tax rate in industry $j$; that is,
$t_j = \pi_j f / w$. Equation (1) yields two basic results from the
literature on law enforcement: (1) the frequency of inspection should be
greatest in those industries with the greatest expected violation,
represented here by the number of illegal workers per establishment; and
(2) some noncompliance is optimal, even within the set of monitored
industries.
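
The rule that the tax factor $1 + t_j$ equals the ratio $a_j/a_m$ is given without proof. One way to see where it can come from is sketched below. The sketch makes an assumption that the text does not state explicitly: the authorities' anticipated demand per establishment has a constant compensated elasticity $\epsilon < 0$, so that $a_j = a_j^*(1+t_j)^{\epsilon}$. It also treats $E_j$, $w$, and $f$ as fixed, in line with the statement that the authorities do not anticipate induced changes in outputs or factor prices. The multiplier $\lambda$ is notation introduced here.

```latex
\begin{align*}
\min_{\{\pi_j\}} \; & \sum_j a_j E_j
  \quad \text{s.t.} \quad \sum_j \pi_j E_j = R, \qquad
  a_j = a_j^*(1+t_j)^{\epsilon}, \quad t_j = \pi_j f / w, \\
\text{first-order condition: } \;
  & -\frac{\partial a_j}{\partial \pi_j}
    = -\epsilon \, \frac{f}{w} \, a_j^* (1+t_j)^{\epsilon - 1}
    = -\epsilon \, \frac{f}{w} \, \frac{a_j}{1+t_j}
    = \lambda
    \quad \text{for every monitored } j, \\
\text{marginal industry } (t_m = 0,\ a_m = a_m^*): \;
  & \lambda = -\epsilon \, \frac{f}{w} \, a_m, \\
\Longrightarrow \;
  & 1 + t_j = \frac{a_j}{a_m}
  \quad \Longleftrightarrow \quad
  1 + t_j = \left( \frac{a_j^*}{a_m^*} \right)^{1/(1-\epsilon)} .
\end{align*}
```

Under this assumption, industries with $a_j^* \le a_m^*$ sit at the corner $t_j = 0$, and every monitored industry retains $a_j = (1+t_j)\,a_m > 0$ illegal workers per establishment, which is the sense in which some noncompliance remains optimal.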

B. Identifying the Monitored Industries

To identify the industries likely to be closely monitored by the
authorities, we estimated the number of illegal workers per
establishment in U.S. industries shortly before the passage of the new
law. Critical to the calculations was information on the industry
distribution of illegal aliens. To have a fine level of industry detail,
we estimated our own distribution from census data.

The industry distribution of illegal workers was derived using the
Public Use Sample of the 1980 Census of Population. People born
outside the United States and whose ability to speak English was poor
were used as a proxy group for illegal aliens. 6 Pearce and Gunther
(1985) show that this method generates occupational and industrial
distributions that are consistent with those of previous studies. We
have also cross-checked geographic and nationality distributions
generated by this method with those estimated by Passel and Woodrow
(1984) and found close correspondence. Although the proxy group
may include refugees and legal aliens, our approach is still valid as
long as these people are distributed across industries in the same way
as illegal aliens.

Table 1 shows all nonfarm industries employing at least 0.75
percent of the proxy group, henceforth referred to as the illegal work
force. Agriculture was excluded from the list because the new law
includes special provisions to maintain a supply of immigrant workers
to fruit and vegetable producers. The 31 industries in the table
account for 74 percent of illegal nonfarm employment. The four
industries employing the largest number of illegals are apparel
manufacturing, restaurants, construction, and food processing.

Estimates of the number of illegal aliens per establishment are
shown in the second column. These figures were obtained by
multiplying the first column by an estimate of illegal nonfarm employment
in 1986 and then dividing the result by the number of business
establishments. 7 The estimates of illegal workers per establishment show a
considerable variance. The large variance implies that optimal
enforcement of sanctions will not be uniform and that a broadening of
surveillance will prove increasingly expensive.

The figures in the second column also reveal that manufacturing 
industries are predominant among industries with a large number of 
illegals at an individual establishment. Of the 21 industries with at 


6 McCarthy and Valdez (1986) also use language proficiency as a selection variable to 
gain information on the illegal alien population. 

7 We assume that 4 million illegal aliens were employed in U.S. nonagricultural
industries in 1986. Census Bureau research indicates that between 3 and 5.5 million
illegals were residing in the country at that time. Recognizing a possible downward bias
in the census figures, we chose 7 million as an upper bound for the size of the illegal
population in 1986. Our estimate of total illegal nonfarm employment follows by
making allowances for those working in agriculture, those not working at all, and those
receiving amnesty. The figure we use is probably too high, but we prefer to err on the
high side.



TABLE 1

Illegal Alien Workers in U.S. Industries

Columns:
(1) Percentage of all illegal aliens in nonagricultural employment
(2) Illegal aliens per establishment
(3) Illegal aliens per hundred workers

Industry                                 (1)       (2)       (3)

Canned foods                            1.97      37.7      26.0
Leather and footwear                    1.81      26.5      25.5
Apparel                                11.52      18.9      28.7
Computers                                .80      18.4       8.2
Meat products                           1.52      16.7      15.6
Grain and bakery products               2.20      13.6      18.2
Transport equipment                     2.61      11.0       3.9
Textiles                                1.81      10.9       7.1
Primary metals                          1.53       8.6       4.7
Hospitals                               1.67       8.1       1.4
Furniture and fixtures                  2.03       8.1      14.0
Electrical machinery                    3.33       8.1       5.7
Paper and allied products               1.09       6.8       6.1
Miscellaneous manufacturing             2.46       6.2      15.7
Chemicals                               1.38       4.6       4.2
Rubber and plastics                     1.38       4.1       7.9
Beverages and miscellaneous foods        .91       3.6       6.6
Department stores                        .87       3.5       1.6
Fabricated metals                       2.75       3.1       7.6
Educational institutions                2.46       2.6       1.2
Hotels and motels                       2.10       2.4       7.0
Services to buildings                   1.30       1.6       3.0
Wholesale grocers                       1.45       1.5       7.3
Landscaping                             1.01       1.5      13.8
Lumber and wood products                1.09       1.3       5.4
Eating and drinking places              8.04       1.0       5.5
Cleaners                                1.01        .9       8.3
Private households                      1.96        .8      10.3
Construction                            7.03        .6       4.2
Auto repair                             1.09        .5       6.1
Retail grocers                          1.52        .5       2.4
All other nonfarm industries           26.30        .3       2.1

least two illegal workers per establishment, 17 are in manufacturing.
Thus to the extent that enforcement is more thorough on large
employers, illegals in manufacturing will be displaced more extensively
than illegals in other sectors of the economy.

Our analysis assumes that there are only fixed costs in auditing
establishments. If monitoring costs vary with establishment size, then
an optimal enforcement pattern will also call for heavy surveillance of
industries with a large ratio of illegal workers to total workers. As the
third column in the table shows, however, this measure is highly
correlated with the number of illegals per establishment (a simple
correlation coefficient of .74), so incorporating it into the selection
process would not cause major changes in the rankings.

The figures in the table are based on national totals for numbers of
illegal workers and business establishments. This reflects our belief
that, in a long-run analysis of sanctions, it is necessary to assume a
high degree of geographic mobility in labor and capital. Nevertheless,
the initial geographic distribution of illegal aliens is highly skewed.
And in the short run, enforcement efforts are likely to be
concentrated in particular states as well as particular industries. To
determine how sensitive our ordering of industries is to the geographic
concentration of illegal aliens, we computed numbers of illegals per
establishment using information only from California, Florida,
Illinois, New York, and Texas. These states account for three-quarters
of the illegal work force but only one-third of all U.S. workers. The
new ordering of industries differed little from the one in the table. Of
the 31 industries, 14 failed to change position, 13 moved up or down
by only one or two positions, and only two moved more than three
positions.

III. Market Adjustments 

The introduction of sanctions will trigger adjustments in the markets 
for illegal and legal labor. Most directly, sanctions reduce the demand 
for illegal workers among monitored industries. This drives down the 
illegal wage rate. As their wage falls, some illegal workers withdraw 
from the U.S. labor market. Others find employment in industries in 
which enforcement is weak and the cost of illegal labor has declined. 

It is less clear how sanctions will affect the wages of legal workers. 
Industries facing a higher cost of illegal labor have an incentive to 
employ more legal workers that are substitutable for illegals. But 
there is the opposite incentive in industries in which the cost of illegal 
labor has fallen. The net result depends on the relative strengths of 
the two effects. 
A. The General Equilibrium Model

The model we use to quantify the economic effects of employer
sanctions is similar to the one used by Harberger (1962) in his analysis
of the corporation income tax. There are two sectors in the U.S.
economy. Sector A consists of industries subject to heavy monitoring by
the immigration authorities. Sector B consists of all other
nonagricultural industries, with enforcement there light or negligible.
What is meant by the terms “heavy” and “light” will be made clear in
Section IVA.

There are constant returns to scale in production. Each sector
employs four factors: illegal immigrant labor ($I$), legal low-skill
labor ($L$), legal high-skill labor ($H$), and capital ($K$). We use
educational attainment to define skill classes. Legal workers are
classified as high-skill if they have completed at least 4 years of high
school.

We follow Jones (1965) in choosing the mathematical form of the 
equilibrium conditions: 


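$$c_{iA} x_A + c_{iB} x_B = N_i, \quad i = I, L, H, \qquad (2)\text{-}(4)$$

where $N_I = F(w_I/e(p))$, $N_L = \bar{N}_L$, $N_H = \bar{N}_H$,

$$\frac{w_K}{e(p)} = \bar{r}, \qquad (5)$$

$$\sum_i c_{iA} w_i + c_{IA} w_I t_A = p, \qquad (6)$$

$$\sum_i c_{iB} w_i + c_{IB} w_I t_B = 1, \qquad (7)$$

$$\frac{x_A}{x_B} = D(p). \qquad (8)$$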

Equations (2)-(4) are market-clearing conditions for the three labor
markets. The left-hand side of each equation details the aggregate
demand for a particular labor group. The demand for labor of type $i$
by firms in sector $j$ is expressed as the product of the quantity of
labor that minimizes unit production costs, $c_{ij}$, and the level of
sectoral output, $x_j$. The $c_{ij}$ depend on relative factor costs;
their elasticities convey information about the degree of technical
substitution possible between factors.

Factors are perfectly mobile within the domestic economy, across
both geographical regions and industries. Factor supplies are denoted
$N_i$. Legal workers are assumed to be in fixed supply. The supply of
illegal immigrant workers, on the other hand, varies directly with the
real illegal wage. Real factor returns are defined by deflating the
nominal return, $w_i$, by a cost-of-living index, $e(p)$. For
computational purposes, good B was used as a numeraire. Thus the term
$p$ in $e(p)$ denotes the relative price of good A.

As indicated by equation (5), capital is assumed to be in perfectly
elastic supply at a constant real return. This allows for the possibility
that some capital may be withdrawn from the U.S. economy as
immigrant labor becomes scarce. Equations (6) and (7) are competitive
profit conditions for the two sectors. In a competitive equilibrium,
unit production costs must reflect market prices. Note that costs are
defined to include the effective tax rates on illegal labor. Finally,
equation (8) is a simple demand condition used to close the model.

B. Key Parameter Values 

The equations of the incidence model must be fully specified before
the model can be solved. Within the equations are functional
expressions for the number of illegal workers per establishment, unit
factor demands, the supply of illegal workers, the cost-of-living index,
and relative product demand. What follows is a brief discussion of the
values chosen for some of the key parameters in these functions. A
more complete account of the model is available in Hill and Pearce
(1988).

Size of the illegal work force.—Our analysis assumes that 4 million
illegal aliens were employed in U.S. nonagricultural industries shortly
before the passage of the new law. Illegal workers represented only 4
percent of total U.S. nonfarm employment at that time. Their share
in domestic value added was even smaller, roughly 2 percent. These
basic facts ensure that the effects of sanctions on gross domestic
product and income will be modest, whatever the level of enforcement.

Illegal aliens are important, however, in the market for low-skill 
labor. By our estimates, there were three illegal workers for every 10 
legal low-skill workers when the law was passed. Sanctions then may 
have significant effects on low-skill wages, provided that illegal and 
legal low-skill workers are highly substitutable and that the law is well 
enforced. 

Elasticities of factor substitution.—These parameters are crucial for
determining the direction and magnitude of the effect sanctions have
on the real wages of legal U.S. workers. In a survey of econometric
studies of factor substitution, Hamermesh and Grant (1979) conclude
that high-skill labor and physical capital are each substitutable for
low-skill labor and that high-skill labor is less substitutable for capital
than low-skill labor is. In the spirit of these conclusions and in
accordance with the general magnitudes of existing estimates, we chose the
following values for the (Allen) elasticities of substitution between
legal low-skill labor, legal high-skill labor, and capital:
$\sigma_{LH} = .75$, $\sigma_{LK} = 1.0$, and $\sigma_{HK} = .25$.
Estimates of substitution elasticities are not available by detailed
industry. Therefore, the chosen estimates were assumed to apply to all
industries.

An indirect approach was necessary to obtain estimates of the
technical substitution possible between illegal immigrants and other
factors. In terms of education, illegal immigrants are very similar to
legal low-skill workers. Therefore, we assumed that the substitution
elasticities between each of these two labor groups and high-skill labor
and physical capital are the same. This left us with $\sigma_{IL}$, the
elasticity of substitution between illegal and legal low-skill labor.
Given values for $\sigma_{IH}$, $\sigma_{IK}$, and the distributive shares
of all factors, $\sigma_{IL}$ can be determined from a knowledge of the
compensated elasticity of demand for illegal immigrant labor,
$\epsilon$. In our base case, we assumed a value of -1.5 for $\epsilon$,
which, in turn, implies a value of 11.8 for $\sigma_{IL}$. This provides
for a high degree of substitutability between illegal and legal
low-skill labor. A more moderate value for $\sigma_{IL}$ was considered in
a sensitivity exercise.
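
The step from the compensated demand elasticity to $\sigma_{IL}$ can be sketched with the standard Allen-elasticity identity, in which the own-price compensated elasticity equals minus the share-weighted sum of the cross elasticities of substitution. The snippet below is only an illustration: the function name and the cost shares are hypothetical (the shares are not reported in the text and were picked here so that the formula lands close to the quoted values), while $\sigma_{IH} = .75$ and $\sigma_{IK} = 1.0$ follow from the assumption that illegal labor substitutes with high-skill labor and capital as legal low-skill labor does.

```python
def sigma_il(eps, s_l, s_h, s_k, sigma_ih=0.75, sigma_ik=1.0):
    """Solve eps = -(s_l*sigma_il + s_h*sigma_ih + s_k*sigma_ik) for sigma_il.

    eps is the compensated own-price elasticity of demand for illegal labor;
    s_l, s_h, s_k are the cost shares of legal low-skill labor, legal
    high-skill labor, and capital.
    """
    return (-eps - s_h * sigma_ih - s_k * sigma_ik) / s_l


# Hypothetical distributive shares (assumption, not from the paper). With
# illegal labor at roughly 2 percent of value added, they sum to one.
SHARES = dict(s_l=0.06, s_h=0.52, s_k=0.40)

base_case = sigma_il(-1.5, **SHARES)    # base-case demand elasticity
sensitivity = sigma_il(-1.0, **SHARES)  # value used in the sensitivity exercise

print(f"sigma_IL (eps = -1.5): {base_case:.1f}")
print(f"sigma_IL (eps = -1.0): {sensitivity:.1f}")
```

Because the low-skill share is small, a modest change in $\epsilon$ moves $\sigma_{IL}$ a great deal, which is why the implied substitution elasticity is so sensitive to the assumed demand elasticity.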

Elasticity of supply of illegal immigrants.—This parameter is crucial for
determining how effective a given enforcement effort will be in
reducing the illegal alien work force. In their survey article, Krugman
and Bhagwati (1976) conclude that elasticities of migration with
respect to destination earnings generally lie between 0.5 and 2.0. This is
consistent with the results of Greenwood and McDowell (1982), who
find an elasticity of reported emigration from Mexico to the United
States of 1.4. Given that our model ignores the downward pressure on
foreign wages that would result from a more restrictive U.S.
immigration policy, whatever elasticity is chosen should be adjusted
downward. In our base case, we used a value of 1.0 for the elasticity.

IV. Incidence Analysis

In this section we provide a numerical analysis of the effects of
employer sanctions on resource allocation and income distribution in the
United States. The analysis combines the enforcement theory
presented in Section II with the general equilibrium theory described in
Section III. The enforcement model produces a set of effective tax
rates that are optimal given particular values for factor prices,
commodity prices, and industry outputs. The general equilibrium model
produces values for prices and outputs given particular values for the
tax rates. An enforcement equilibrium is obtained by solving the two
models simultaneously. 8

8 This is similar to the approach used by Graetz, Reinganum, and Wilde (1986) and
Dubin and Wilde (I960) in their studies of auditing and income tax compliance. In 
their models, audits have a deterrent effect on income tax evasion. There is also a yield
A. Defining the Enforcement Regimes

It is useful to evaluate the effects of sanctions under different levels
of enforcement. We consider three possible enforcement scenarios.
They range from a minimal level of enforcement, in which only large
employers are monitored for compliance, to a very ambitious
campaign, with substantial surveillance carried out in all parts of the
economy.

Low enforcement.—Inspections are limited to industries that initially
employ an average of two or more illegal aliens per business
establishment. Sector A consists of all such industries, as identified in
table 1. Sector B comprises all other nonagricultural industries.
Included in the enforced sector, therefore, are virtually all the
penetrated manufacturing industries. Notable within the unenforced
sector are the restaurant and construction industries.
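
The partition into sectors A and B can be checked directly against table 1. The sketch below transcribes columns (1) and (2) of the table (reading the two damaged entries as 1.45 and 1.6) and sums the employment shares of the industries at or above a cutoff in illegal aliens per establishment. The dictionary and function names are illustrative only.

```python
# industry: (percent of illegal nonfarm employment, illegal aliens per establishment)
TABLE_1 = {
    "Canned foods": (1.97, 37.7),
    "Leather and footwear": (1.81, 26.5),
    "Apparel": (11.52, 18.9),
    "Computers": (0.80, 18.4),
    "Meat products": (1.52, 16.7),
    "Grain and bakery products": (2.20, 13.6),
    "Transport equipment": (2.61, 11.0),
    "Textiles": (1.81, 10.9),
    "Primary metals": (1.53, 8.6),
    "Hospitals": (1.67, 8.1),
    "Furniture and fixtures": (2.03, 8.1),
    "Electrical machinery": (3.33, 8.1),
    "Paper and allied products": (1.09, 6.8),
    "Miscellaneous manufacturing": (2.46, 6.2),
    "Chemicals": (1.38, 4.6),
    "Rubber and plastics": (1.38, 4.1),
    "Beverages and miscellaneous foods": (0.91, 3.6),
    "Department stores": (0.87, 3.5),
    "Fabricated metals": (2.75, 3.1),
    "Educational institutions": (2.46, 2.6),
    "Hotels and motels": (2.10, 2.4),
    "Services to buildings": (1.30, 1.6),
    "Wholesale grocers": (1.45, 1.5),
    "Landscaping": (1.01, 1.5),
    "Lumber and wood products": (1.09, 1.3),
    "Eating and drinking places": (8.04, 1.0),
    "Cleaners": (1.01, 0.9),
    "Private households": (1.96, 0.8),
    "Construction": (7.03, 0.6),
    "Auto repair": (1.09, 0.5),
    "Retail grocers": (1.52, 0.5),
    "All other nonfarm industries": (26.30, 0.3),
}


def sector_a(cutoff):
    """Industries with at least `cutoff` illegal aliens per establishment."""
    return [name for name, (_, per_est) in TABLE_1.items() if per_est >= cutoff]


def sector_a_share(cutoff):
    """Percent of illegal nonfarm employment initially in sector A."""
    return sum(TABLE_1[name][0] for name in sector_a(cutoff))


low_share = sector_a_share(2.0)     # cutoff of two per establishment
medium_share = sector_a_share(0.5)  # cutoff of one-half per establishment
print(len(sector_a(2.0)), round(low_share, 1))
print(len(sector_a(0.5)), round(medium_share, 1))
```

With a cutoff of two the monitored sector holds 21 industries and about 48 percent of illegal nonfarm employment; lowering the cutoff to one-half brings in all 31 listed industries and about 74 percent, matching the initial sector A shares of .48 and .74 used in the numerical results.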

Following the discussion in Section IIA, the average tax rate for
industries in sector A is determined from $1 + t_A = a_A/a_m$, where the
marginal industry is defined as having an initial value of $a_j$ equal to
2.0. The tax rate for industries in sector B is zero.

Medium enforcement.—The budget is increased to allow for some
monitoring of all the penetrated industries. Accordingly, sector A is
broadened to include all the industries in table 1. The tax rate in this
sector is again given by $1 + t_A = a_A/a_m$, but the marginal industry
is now defined as having an initial value of $a_j$ equal to 0.5. Sector B
is made up of all nonagricultural industries not listed in table 1. The
tax rate in this sector is zero.
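
A rough back-of-the-envelope version of the sector A tax rates can be computed from figures given elsewhere in the paper: 4 million illegal nonfarm workers, initial sector A shares of .48 and .74, and 275,000 and 1,544,000 sector A establishments under low and medium enforcement. The sketch assumes, as a simplification not spelled out in the text, that illegal employment per establishment responds to the tax with a constant elasticity of -1.5, so that $1 + t_A = a_A/a_m$ becomes $1 + t_A = (a_A^*/a_m)^{1/(1-\epsilon)}$ in terms of the initial value $a_A^*$. It ignores the general equilibrium feedback from wages and outputs. The function name is hypothetical.

```python
def effective_tax_rate(a_initial, a_marginal, eps=-1.5):
    """Tax rate t solving 1 + t = a / a_marginal when a = a_initial * (1 + t)**eps."""
    return (a_initial / a_marginal) ** (1.0 / (1.0 - eps)) - 1.0


TOTAL_ILLEGAL_WORKERS = 4_000_000

# Initial illegal workers per establishment in sector A under each regime.
a_low = 0.48 * TOTAL_ILLEGAL_WORKERS / 275_000
a_medium = 0.74 * TOTAL_ILLEGAL_WORKERS / 1_544_000

t_low = effective_tax_rate(a_low, a_marginal=2.0)
t_medium = effective_tax_rate(a_medium, a_marginal=0.5)

print(f"low enforcement:    t_A = {t_low:.2f}")
print(f"medium enforcement: t_A = {t_medium:.2f}")
```

Under these assumptions the rates come out close to the 65 and 71 percent reported for sector A in table 2, which suggests the simplified rule captures most of what drives the reported tax rates.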

High enforcement.—Sectors A and B are defined as in the
medium-enforcement regime. However, the budget now is assumed to be large
enough to support substantial monitoring efforts in both sectors.
Because $a_A$ and $a_B$ differ, it is not optimal to monitor the two
sectors with the same intensity. The optimal tax differential is
determined by $(1 + t_A)/(1 + t_B) = a_A/a_B$. The overall level of
taxation is chosen to achieve a 50 percent reduction in the illegal alien
work force.

B. Numerical Results 

Table 2 reports our numerical findings for the three enforcement
scenarios. Part A of the table provides a breakdown of the change in
the cost of employing an illegal alien. The change in cost for a
particular sector reflects both the effective tax rate in that sector and
the change in the illegal wage rate. Part B details the effect sanctions
have


effect of audits, however, so that an increase in compliance levels leads to a decrease in
the audit rate. These two relationships are solved simultaneously to produce
equilibrium audit rates and compliance levels.
TABLE 2

Calculated Effects of Sanctions

                                                   Level of Enforcement
                                                   Low    Medium      High

A. Cost of Illegal Labor
Percentage change in illegal wage                  -10       -18       -49
Effective tax rate, as a percentage
  of new illegal wage:
    Sector A                                        65        71       321
    Sector B                                         0         0       105
Percentage change in cost of illegal labor:
    Sector A                                        48        39       114
    Sector B                                       -10       -18         4

B. Employment of Illegal Aliens
Initial share in sector A                          .48       .74       .74
Percentage change in employment in sector A        -44       -39       -68
Fraction of those displaced leaving
  the U.S. labor market                            .50       .66       .99
Percentage change in total illegal employment      -11       -19       -50

C. Production
Percentage change in output:
    Sector A                                      -1.4      -1.5       -16
    Sector B                                       -.1        .0       -.5
Percentage change in gross domestic product        -.5       -.7      -2.1

D. Wages of Legal Workers
Percentage change in real wage:
    Legal low-skill workers                        2.6       4.6      12.8
    Legal high-skill workers                      -1.0      -1.4      -4.1
Percentage change in aggregate real
  earnings of legal workers                        -.6       -.8      -2.3



on the allocation of illegal workers. In our calculations, not all the
illegals displaced from sector A leave the country. Some are absorbed
by sector B. To highlight this result, we express the percentage
change in total employment of illegal aliens as the product of three
terms: (1) the share of the illegal work force initially employed in
sector A, (2) the percentage change in employment of illegal aliens in
sector A, and (3) the fraction of illegal workers displaced from sector
A that leave the U.S. labor market. Part C of the table shows the effect
sanctions have on domestic production. Part D reports the changes in
the real wages of legal workers. All percentage changes represent
deviations from an equilibrium without sanctions.
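
The two accounting relationships behind parts A and B of table 2 can be checked with a few lines of arithmetic. The sketch below uses the rounded table entries, so small discrepancies from the reported figures are expected. The cost relationship, in which the cost of illegal labor moves with the product of the new wage and one plus the tax rate, follows from the full cost $w(1 + t_j)$ defined in Section IIA; the names are illustrative.

```python
# Rounded entries from table 2, expressed as fractions.
REGIMES = {
    "low":    dict(wage=-0.10, tax_a=0.65, tax_b=0.00,
                   share_a=0.48, emp_a=-0.44, leaving=0.50),
    "medium": dict(wage=-0.18, tax_a=0.71, tax_b=0.00,
                   share_a=0.74, emp_a=-0.39, leaving=0.66),
    "high":   dict(wage=-0.49, tax_a=3.21, tax_b=1.05,
                   share_a=0.74, emp_a=-0.68, leaving=0.99),
}


def cost_change(wage_change, tax_rate):
    """Change in the full cost w * (1 + t) relative to the pre-sanction wage."""
    return (1.0 + wage_change) * (1.0 + tax_rate) - 1.0


def total_employment_change(share_a, emp_a, leaving):
    """Product of the three terms described in the text."""
    return share_a * emp_a * leaving


for name, r in REGIMES.items():
    print(
        name,
        f"cost A {cost_change(r['wage'], r['tax_a']):+.2f}",
        f"cost B {cost_change(r['wage'], r['tax_b']):+.2f}",
        f"total employment "
        f"{total_employment_change(r['share_a'], r['emp_a'], r['leaving']):+.2f}",
    )
```

The products land within about a percentage point of the table's cost changes (48, 39, and 114 percent in sector A) and total employment changes (-11, -19, and -50 percent).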

Under low enforcement, sanctions reduce the supply of illegal
workers by 11 percent. Even though 48 percent of the illegal work
force is initially employed in monitored industries, surveillance is
thorough enough to achieve only a 44 percent decline in
employment. Taken together, these figures indicate that 22 percent of all
illegal workers are displaced. Of these, only one-half withdraw from
the U.S. labor market. The other half find work in sector B, where the
cost of employing an illegal alien has fallen.

When enforcement is low, sanctions have small effects on
production and wages. Gross domestic product falls, because of an outflow
of both illegal workers and capital, but it falls by only 0.5 percent.
Sanctions do produce a rise in the real wage of legal low-skill workers,
a result that is expected and desired by most supporters of
immigration reform. The size of the increase is small, however: only 2.6
percent. The gains in the low-skill wage are achieved at the expense of
legal high-skill workers, who suffer a 1.0 percent real-wage decline.
Results not reported reveal that a portion of this burden would also
fall on capital were there immobilities in the supply of capital.

The analysis indicates that sanctions reduce the aggregate real
earnings of legal workers. Because the return to capital is fixed,
sanctions must also reduce the collective earnings of all national factors.
This need not imply, however, that the new law fails to serve the
national interest. The decline in national earnings may be an
acceptable price to pay for a more equal distribution of income. In addition,
if illegal aliens are net recipients of transfer payments, sanctions can
raise national income by lowering the tax contributions of nationals.

Under medium enforcement, surveillance is extended to all the
industries in table 1, and the size of sector A is accordingly increased.
Thus even though the percentage reduction in illegal employment in
sector A is smaller under medium enforcement than under low
enforcement, the absolute number of illegal aliens displaced is greater.
And with a smaller set of industries left unmonitored, more of those
who are displaced leave the United States. The net result is a 19
percent reduction in the illegal work force.

Even with medium enforcement, sanctions have modest effects on
wages and production. It is only when enforcement is high that the
effects become significant. In this case, sanctions produce a 12.8
percent rise in the real wage of legal low-skill workers. The burden on
high-skill workers is also more substantial, with their real wage falling
4.1 percent. The reason these effects are more significant is that
enforcement is assumed to be strong enough to reduce the illegal work
force by 50 percent. Monitoring efforts serve to double the cost of
illegal labor for firms in sector A, resulting in a 68 percent decline in
their employment of illegal aliens. Because sector A initially accounts
for 74 percent of illegal employment, this means that 50 percent of all
illegal workers are displaced. Under high enforcement, firms in
sector B are also monitored, and with sufficient intensity to prevent their
cost of illegal labor from falling. As a result, virtually none of the
workers displaced from sector A are absorbed by sector B.

Marginal cost of enforcement.—Implicit in the move from low to high
enforcement is a larger enforcement budget. To see what is involved,
let the probability of detection in sector $j$ be represented by the ratio
of total inspections to number of business establishments in that
sector. If there is a constant fine, a ratio of the enforcement efforts in
any two regimes, 1 and 2, can then be expressed as

$$\frac{R^2}{R^1} = \frac{\sum_j w^2 t_j^2 E_j^2}{\sum_j w^1 t_j^1 E_j^1},$$

where superscripts index the regime.

The right-hand side of the equation can be evaluated by combining
data on number of business establishments with results from the
incidence analysis. On doing so, we find that to move from low to
medium enforcement, and to thereby achieve a 19 percent rather than
an 11 percent reduction in the illegal work force, requires more than
a fivefold increase in the number of work site inspections. To move
from medium to high enforcement, reducing the illegal work force by
50 percent rather than 19 percent, requires more than a fourfold
increase in work site inspections. 9 These results somewhat overstate
the incremental cost of obtaining broader compliance. Large
establishments take more staff time to inspect than small ones, and large
establishments are concentrated among industries in the low regime.
Nevertheless, it is clear that the marginal cost of enforcement rises
sharply with the fraction of the illegal working population to be
removed from the domestic economy.
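
The inspection ratios can be approximated from reported figures. Since $\pi_j = t_j w / f$, the number of inspections in a regime is proportional to $w \sum_j t_j E_j$ when the fine is constant. The sketch below uses the wage changes and tax rates from table 2 and the initial establishment counts from the paper's footnote on establishments. As a simplification introduced here, it ignores the small changes in establishment numbers implied by the output changes in part C, so the results are approximate. The function name is hypothetical.

```python
def inspection_index(wage_change, sectors):
    """Quantity proportional to total inspections: w * sum_j t_j * E_j.

    `sectors` is a list of (tax_rate, establishments) pairs; the constant
    fine and the pre-sanction wage cancel when regimes are compared.
    """
    return (1.0 + wage_change) * sum(t * e for t, e in sectors)


low = inspection_index(-0.10, [(0.65, 275_000)])
medium = inspection_index(-0.18, [(0.71, 1_544_000)])
high = inspection_index(-0.49, [(3.21, 1_544_000), (1.05, 3_115_000)])

low_to_medium = medium / low
medium_to_high = high / medium

print(f"low -> medium:  {low_to_medium:.1f}x inspections")
print(f"medium -> high: {medium_to_high:.1f}x inspections")
```

Both ratios agree with the text's description of a more than fivefold and then a more than fourfold increase in work site inspections.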

Sensitivity experiments.—Two of the parameters most crucial to an
analysis of employer sanctions are the elasticity of substitution
between illegal and legal low-skill labor, $\sigma_{IL}$, and the elasticity
of immigrant supply. To determine the degree of sensitivity in our
earlier results, we repeated our analysis of medium enforcement using
alternative values for the two parameters. To make the comparisons
meaningful, the tax rate in sector A was chosen to provide the same
enforcement effort as in the base case.

As previously explained, $\sigma_{IL}$ was derived from a value assumed for
the compensated elasticity of demand for illegal labor, $\epsilon$. In our
initial calculations, we assumed a value of -1.5 for $\epsilon$, which, in
turn, implied

9 The interested reader can verify the calculations with the aid of the following 
information. Under low enforcement, there are initially 275,000 business establish¬ 
ments in sector A. With medium enforcement, sector A contains 1,544,000 establish¬ 
ments. In the high-enforcement regime, there are 1,544,000 establishments in sector A 
and 3,115,000 establishments in sector B. Given a fixed level of output per
establishment, numbers of establishments in the posttax equilibria can be
computed by using information from pt. C of table 2.

a value of 11.8 for $\sigma_{IL}$. As a sensitivity exercise, we chose a value of
-1.0 for $\varepsilon$, with the corresponding value for $\sigma_{IL}$ falling to 3.5. The
most significant change in the results was in the legal low-skill wage. 
With factor substitution more limited, the benefits of immigration 
reform to legal workers were greatly reduced. The rise in the real 
wage of legal low-skill workers was less than one-fourth the amount of 
the base case. 

In a second sensitivity experiment, we reduced the value of the 
elasticity of immigrant supply from 1.0 to 0.5. The most notable 
change in the results was in the effect sanctions had on the size of the 
illegal work force. With the illegal labor supply less elastic, a given 
enforcement effort was only two-thirds as effective in reducing the 
illegal working population. Thus reductions in the illegal work force 
may prove even more expensive to obtain than our earlier results 
indicated. 


V. Conclusions 

The 1986 immigration law will reduce employment of illegal aliens in 
some parts of the economy but not in others. Limited budgets will 
force authorities to focus their enforcement efforts on industries with 
a large concentration of illegals at an individual business establish¬ 
ment. Manufacturing industries are predominant among large em¬ 
ployers of illegal aliens. Thus the contractionary effects of immigra¬ 
tion reform are likely to be felt most strongly in manufacturing. 
Sectors that employ illegals with a low concentration at the establish¬ 
ment level, such as services and construction, are likely to face weak 
enforcement and may absorb significant numbers of displaced aliens. 

Sanctions will raise the living standards of legal workers in low-skill 
occupations, but these gains are not likely to be large. For sanctions to 
increase the real wages of legal low-skill workers by 10 percent, illegal 
and legal low-skill labor must be highly substitutable and the law must 
be enforced well enough to reduce the illegal working population by 
one-half. The costs of immigration reform will be borne primarily by 
high-skill workers. The percentage reduction in their real wages will 
also be moderate, however. 

In studying the economic effects of sanctions, we have explicitly 
recognized the selective and differential nature of optimal enforce¬ 
ment. This approach was necessary because of the limited size of the 
enforcement budget and the uneven distribution of potential 
violators across the economy. Agencies enforcing other laws and reg¬ 
ulations often face similar constraints. Thus much of our analytical 
approach can be applied to these cases. By incorporating the optimal 
pattern of enforcement into an analysis of the economic effects of a 
law, it is possible to obtain a rich understanding of the law's
distributional consequences, as well as a more accurate estimate of its
aggregate effects.


Occupational Matching: A Test of Sorts


Brian P. McCall 

University of Minnesota 


This paper develops a theory of job matching in which matching 
information has both job-specific and occupation-specific compo¬ 
nents. If occupational matching is significant, then the theory pre¬ 
dicts that for those who have switched jobs but remained in the same 
occupation, increased tenure in the previous job lowers the likeli¬ 
hood of separation from the current job. These predictions are 
tested using job tenure data from the National Longitudinal Survey’s 
youth cohort. In general, the data are consistent with the
occupational matching hypothesis.


I. Introduction 

Job shopping is a common explanation for job turnover (see Reynolds 
1951). Recently, a number of studies have formalized this notion (e.g., 
Johnson 1978; Jovanovic 1979a, 1979b, 1984; Viscusi 1979, 1980;
Wilde 1979; Lippman and McCall 1981; Miller 1984). These studies 
of job shopping or job matching assume that matching occurs only at 
the job level. It seems likely that matching also takes place at the 
occupational level. This is suggested by the significant fraction of 
people who switch occupations when switching jobs (see Miller 1984; 
Shaw 1987). The purpose of this paper is to see whether any empir¬ 
ical evidence exists to support this notion of occupational matching. 1 


This paper is drawn from chap. 2 of my doctoral dissertation (McCall 1988b). I would
like to thank David Card, Michael Keane, Whitney Newey, Sherwin Rosen, and an 
anonymous referee for excellent comments on an earlier draft. I would also like to 
thank Bruce Meyer for providing his semiparametric hazard estimation program. I am 
grateful to the Industrial Relations Section of Princeton University for financial sup¬ 
port. 

1 For an empirical test of the job-matching hypothesis, see Flinn (1986).

To achieve this goal, a theoretical model of job matching is
developed in which matching information has both job-specific and
occupation-specific components. If occupational matching is
important, then the theory predicts that the likelihood of separation from
the current job should decrease with increased tenure in the previous
job only for those individuals who do not switch occupations when 
switching jobs (employers). The magnitude of this decrease increases 
with the arrival rate of occupation-specific matching information. In 
general, the effect of a change in the variance of occupation-specific 
matching information on this magnitude is ambiguous. 

The data used in the empirical analysis were derived from the 
National Longitudinal Survey’s youth cohort. The sample consists of 
those respondents who were full-time students at the start of the 
survey, entered the labor force, and worked for at least two different 
employers since leaving school. By means of a proportional hazards 
approach (see Kalbfleisch and Prentice 1980; Kiefer 1988), the deter¬ 
minants of the job separation hazard are analyzed for the second job 
worked after leaving school. For the full sample, an increase in tenure 
in the first job worked after leaving school significantly lowers the rate 
of separation from this second job. However, the magnitude of this
effect is significantly greater for individuals whose occupation
remains the same across both jobs.

The remainder of this paper is organized as follows: Section II 
develops the theoretical model of the paper. Under the assumption 
that there is no job search (i.e., no wage uncertainty) and that both 
job-specific and occupation-specific matching information arrive once 
(independently of each other) at some random time, the structural 
form of the job separation hazard is derived for both those working 
their first job and those working their second job in an occupation. 

Section III describes the data, and Section IV presents the empir¬ 
ical results. Estimates are obtained using both parametric and 
semiparametric techniques (see Cox 1975; Prentice and Gloeckler
1978; Meyer 1986), for occupational switching defined at the
one-digit and three-digit 1970 census classification levels, and when
unobserved heterogeneity or measurement error is accounted for (see
Lancaster 1979; Heckman and Singer 1984a).

Finally, Section V contains a summary and conclusions. 


II. Theoretical Model 

This section develops a model of job matching in which matching
information has job-specific and occupation-specific components.
After a simple model is formulated, the worker's optimal sampling
strategy is designed. Next, the empirical implications of the
theory for job turnover are discussed.

For simplicity, assume that initial wages are known with certainty. 
Furthermore, there is no intraoccupational variation in these wages. 

Let $w_i$ denote the initial wage for the $i$th occupation, $i = 1, \ldots, N$.

Assume that a job (employer) switch that also results in an occupation 
switch is at least as costly as an intraoccupational job switch. This is 
reasonable whenever initial training is needed to work in an
occupation. Symbolically, the cost of working the first job in the $i$th
occupation is $C_i$, and subsequent job switches in the $i$th occupation cost $c_i$,
where $C_i \ge c_i$, $i = 1, \ldots, N$. A job switch (with certain wages) entails a
loss of one period's wages.

The worker receives two types of matching information. Both 
affect net wages additively. Job-specific information is matching in¬ 
formation that is relevant only to the current job (this type of 
information is independent across all jobs both within and across 
occupations). Job-specific information for the $j$th job in the $i$th
occupation is represented by $\xi_{ij}$, a random variable identically distributed
across all jobs within an occupation. Observe that $\xi_{ij} \overset{d}{=} \xi_i$, where $\xi_i$ has
cumulative distribution function $F_i$ with $E(\xi_i) = 0$ and $\mathrm{var}(\xi_i) = \sigma_{\xi_i}^2$ (or
simply $\sigma_\xi^2$) $> 0$. Job-specific information for the $j$th job in the $i$th
occupation arrives at a geometric random time, $T_{ij}$, with parameter $r_i$.

Net wages also have an occupation-specific component. For the $i$th
occupation ($i = 1, \ldots, N$), this is represented by the random variable
$\omega_i$ that is distributed with cumulative distribution function $G_i$,
$E(\omega_i) = 0$, and $\mathrm{var}(\omega_i) = \sigma_{\omega_i}^2$ (or simply $\sigma_\omega^2$) $> 0$. The random arrival
time, $T_i$, of the occupation-specific information is geometrically
distributed with parameter $p_i$. In this case, time refers to the total
time worked in the occupation.
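
As a rough illustration of the information structure just described, the following Python sketch draws the two geometric arrival times for a worker's first job in an occupation. The function names and the parameter values (0.10 for the job-specific arrival rate, 0.05 for the occupation-specific arrival rate) are illustrative assumptions and do not come from the paper.

```python
import random


def geometric_arrival(prob, rng):
    """Return the period (1, 2, ...) in which information first arrives,
    when it arrives in each period with probability `prob`."""
    period = 1
    while rng.random() >= prob:
        period += 1
    return period


def simulate_first_job(r_i, p_i, n_workers, seed=0):
    """Draw independent job-specific and occupation-specific arrival
    times for `n_workers` workers (hypothetical helper)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_workers):
        t_job = geometric_arrival(r_i, rng)  # job-specific arrival time
        t_occ = geometric_arrival(p_i, rng)  # occupation-specific arrival time
        draws.append((t_job, t_occ))
    return draws


draws = simulate_first_job(r_i=0.10, p_i=0.05, n_workers=20000)
mean_job = sum(t_job for t_job, _ in draws) / len(draws)
mean_occ = sum(t_occ for _, t_occ in draws) / len(draws)
# Share of workers for whom occupation-specific information arrives first.
share_occ_first = sum(t_occ < t_job for t_job, t_occ in draws) / len(draws)
print(mean_job, mean_occ, share_occ_first)
```

With geometric arrivals the mean waiting times are the reciprocals of the arrival rates, so a low occupation-specific arrival rate means that many first separations occur before any occupational sorting has taken place.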

The random variables $\omega_i$, $\xi_{ik}$, $T_{ik}$, $k = 1, 2, \ldots$, and $T_i$ are mutually
independent for all $i = 1, \ldots, N$; for all $i \ne j$, ($\omega_i$, $\xi_{ik}$, $T_{ik}$, $k = 1, 2, \ldots$,
and $T_i$) is independent of ($\omega_j$, $\xi_{jk}$, $T_{jk}$, $k = 1, 2, \ldots$, and $T_j$). Individuals
sample jobs/occupations to maximize expected discounted income,
where $\beta$ represents the discount rate, $0 < \beta < 1$.

To simplify the discussion of the worker's optimal sampling
strategy, it is assumed that $P(-d_i < \xi_i < d_i) = 0$ for some $d_i > 0$. 2 Given this

2 Formally, the optimal intraoccupational job-switching policy must be independent
of a stopping reward (call it $Z$), when one considers the sequential decision problem of
working in the $i$th occupation or stopping and receiving $Z$ (see Whittle 1980). The value
of $d_i$ that achieves this independence depends on the job-switching costs, $c_i$, the arrival
rate of occupation-specific information, $p_i$, the discount rate, $\beta$, and the arrival rate of
job-specific information, $r_i$. The empirical implications of the theory do not depend on
this assumption.

assumption, the optimal sampling policy is a simple index policy. An
index policy is one in which an individual assigns an index to each
occupation and during period $t$ works in that occupation with the
highest index. Let $Z_i(x_i(t))$ denote the index for the $i$th occupation in
state $x_i$ at time $t$. Then an individual will work in the $i$th occupation in
period $t$ when $Z_i(x_i(t)) = \max_j [Z_j(x_j(t))]$.
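
The selection rule of an index policy can be sketched in a few lines of Python. This is only an illustration of the rule "work in the occupation with the highest index"; the index values are made-up numbers, and the paper does not give a closed form for the indices themselves.

```python
def choose_occupation(indices):
    """Return the position of the occupation with the highest index.
    Ties are resolved in favor of the lowest position."""
    best = 0
    for i, z in enumerate(indices):
        if z > indices[best]:
            best = i
    return best


def best_alternative(indices, current):
    """Index of the most attractive alternative occupation, that is,
    the largest index among all occupations other than `current`."""
    return max(z for j, z in enumerate(indices) if j != current)


# Hypothetical index values for three occupations in the current period.
current_indices = [1.0, 3.5, 2.0]
chosen = choose_occupation(current_indices)
z_star = best_alternative(current_indices, chosen)
print(chosen, z_star)
```

The second helper corresponds to the comparison value used when deciding whether to leave the current occupation: a switch is considered only when the current occupation's index falls below the best alternative.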

A complete characterization ($Z_i(x_i(t))$ for all $x_i(t)$) of the optimal
sampling strategy is unnecessary for deriving an empirical test for 
the presence of occupation-specific information. Intuitively, when 
occupation-specific information is significant, individuals with previ¬ 
ous experience in an occupation should be less likely to leave their 
current employer than those working in an occupation for the first 
time. In the former group, some occupational sorting has occurred, 
and departures tend to occur primarily for job-specific reasons. In the
latter group, however, no occupational sorting has taken place, and 
job separations will occur for both occupation-specific and job-specific 
reasons. Before characterizing these differences in job separation 
rates, consider the different circumstances that induce quits. 

In our model, an individual may leave a job when either job-specific 
or occupation-specific information arrives. For simplicity, assume that 
$d_i$ is sufficiently large that poor job-specific matches ($\xi_i < 0$) always
result in job switches, irrespective of whether or not
occupation-specific information has arrived, and the arrival of
occupation-specific information never results in an intraoccupational job
switch for individuals with good job-specific matches ($\xi_i > 0$).

Given these assumptions, an individual has two choices when
occupation-specific information arrives: remain in the current
occupation permanently or switch occupations. This decision depends not
only on the attractiveness of alternative occupations but also on the
quality of the job-specific match. Let $Z^* = \max_{j \ne i}[Z_j(x_j(t))]$ be the
value of the index of the most attractive alternative occupation; let
$Z_i(\omega_i)$ be the value of the index for occupation $i$ when
occupation-specific but not job-specific information has arrived, and $Z_i(\xi_i, \omega_i)$
the value of the index when both job-specific information and
occupation-specific information have arrived. In the former case, an
occupation switch occurs when $Z_i(\omega_i) < Z^*$, whereas in the latter case,
an occupation switch occurs when $Z_i(\xi_i, \omega_i) < Z^*$. This optimal
sampling policy together with $Z_i$ increasing in $\omega_i$ under both circumstances
gives the following proposition.


3 The problem [...] in this section is a special case of a sequential decision
problem referred to in the literature as the multiarmed bandit problem (see
Gittins and Jones 1974; [...] 1985).

Proposition 1. (a) The probability that an occupation switch will
occur when occupation-specific information arrives, given that it
arrives after an individual receives favorable job-specific information, is
$G_i(\bar{\omega}_i(\xi_i))$, where $\bar{\omega}_i(\xi_i)$ satisfies

$$\bar{\omega}_i(\xi_i) = Z^*(1 - \beta) - w_i - \xi_i \tag{1}$$

and $\xi_i$ is the realized value of the job-specific information. (b) The
probability that an occupation switch will occur when
occupation-specific information arrives, given that it arrives before job-specific
information, is $G_i(\bar{\omega}_i)$, where $\bar{\omega}_i$ satisfies


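$$\bar{\omega}_i = (1 - \beta) Z^* - w_i - \frac{r_i \int_{d_i} \xi_i \, dF_i}{1 - \beta\{1 - \cdots[1 - \cdots]\}} \tag{2}$$

(the terms shown as $\cdots$ in the denominator are illegible in the source).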


Proof. See Appendix A. 

Define $h_{1i}(t)$ = Pr{separate from first job worked in occupation $i$ at
tenure $t$ | no separation before $t$} and $h_{2i}(t|T^*)$ = Pr{separate from
second job worked in occupation $i$ at tenure $t$ | no separation before $t$
and completed tenure $T^*$ for first job worked in occupation $i$}. These
are the job separation hazards for individuals working in their first
and second jobs in an occupation, respectively. The next theorem
characterizes these job separation hazards in terms of the underlying
parameters of the model.

Theorem 1. $h_{1i}(t) = P^*(t)$ and $h_{2i}(t|T^*) = z(T^*)P^*(t) + [1 - z(T^*)]Q^*(t)$,
where


P*(t) 


s,piG,( o>,) + r,F,( — d,) + p,r, j 

' G,(u>,(4,))dF,j 

(i -sj-v 

| Gdw.OdF, 


+ (1 ~ q , r l )s‘~ l r,F,(-d l ) + 


(3) 


and

$$Q^*(t) = s_i^{t-1} r_i F_i(-d_i), \tag{4}$$

$$z(T^*) = \frac{q_i^{T^*}}{q_i^{T^*} + [1 - G_i(\bar{\omega}_i)](1 - q_i^{T^*})}, \tag{5}$$

with $q_i = 1 - p_i$ and $s_i = 1 - r_i$.

Proof. See Appendix A. 
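
The mixture structure of the second-job hazard can be made concrete with a short Python sketch. It assumes that the weight takes the form $z(T^*) = q_i^{T^*} / \{q_i^{T^*} + [1 - G_i](1 - q_i^{T^*})\}$, which is one reading of the partly illegible expression in the source, and it treats $P^*(t)$ and $Q^*(t)$ as given numbers with $P^*(t) > Q^*(t)$. All numerical values and function names are hypothetical.

```python
def stayer_weight(tenure_prev, p_i, g_bar):
    """Probability that an occupational stayer with previous tenure
    `tenure_prev` has NOT yet received occupation-specific information.
    `g_bar` is the probability of switching occupations once that
    information arrives (assumed form of the weight z)."""
    q_i = 1.0 - p_i
    no_info = q_i ** tenure_prev
    return no_info / (no_info + (1.0 - g_bar) * (1.0 - no_info))


def second_job_hazard(p_star, q_star, tenure_prev, p_i, g_bar):
    """Mixture of the 'unsorted' hazard p_star and the 'sorted' hazard
    q_star, weighted by the stayer weight."""
    z = stayer_weight(tenure_prev, p_i, g_bar)
    return z * p_star + (1.0 - z) * q_star


# Hypothetical values: P*(t) = 0.08, Q*(t) = 0.03, p_i = 0.05, G_i = 0.4.
hazards = [second_job_hazard(0.08, 0.03, T, 0.05, 0.4) for T in (0, 10, 50)]
print(hazards)
```

Under these assumptions the weight equals one at zero previous tenure, so the second-job hazard coincides with the first-job hazard, and it falls toward the sorted hazard as previous tenure grows.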

On reflection, it is clear that $P^*(t) > Q^*(t)$ for all $t$, $dQ^*(t)/dt < 0$ for
all $t$, and $dP^*(t)/dt < 0$ for $t$ sufficiently large. Note that $h_{2i}(t|0) = h_{1i}(t)$.
Differentiating $h_{2i}(t|T^*)$ with respect to $T^*$ gives the following
proposition.

Proposition 2. $\partial h_{2i}(t|T^*)/\partial T^* < 0$ for all $T^* \ge 0$.

Proposition 2 is the key empirical prediction of the theory. If
occupational matching is significant, then tenure in the previous job
should have a negative effect on the likelihood of leaving the current
job for individuals who do not switch occupations when switching
jobs. Trivially, if the variance of the occupation-specific component,
$\sigma_\omega^2$, is zero, then the magnitude of this effect is zero.

Although the existence of occupational matching implies that an
increase in tenure in the previous job should reduce the likelihood of
separation from the current job for occupational stayers, the effect of
"small" increases in the variance of occupation-specific information
on the magnitude of this reduction is, in general, ambiguous. For
example, assume that $\omega_i$ is normally distributed and that $\xi_i$ is a
random variable taking only the values $a_i$ and $-a_i$, each with probability
1/2. Then we obtain the following proposition.

Proposition 3. (a) If $Z^*(1 - \beta) - w_i - a_i > 0$, then
$\partial^2 h_{2i}(t|T^*)/\partial T^* \partial\sigma_\omega^2 > 0$, for $T^* > \ln\{1/[2 - G_i(\bar{\omega}_i)]\}/\ln q_i$. (b) If $\bar{\omega}_i < 0$, then
$\partial^2 h_{2i}(t|T^*)/\partial T^* \partial\sigma_\omega^2 < 0$, for $T^* > \ln\{1/[2 - G_i(\bar{\omega}_i)]\}/\ln q_i$.

Proof. See Appendix A. 

If individuals have available attractive alternative occupations (so
that $Z^*$ is high) or job-specific matching is relatively unimportant ($a_i$ is
low), then only those individuals whose net wages increase by a
sufficiently large amount will remain after the arrival of
occupation-specific information. Under these circumstances, an increase in the
variance of occupation-specific information increases the proportion
of those who have received occupation-specific information among
occupational stayers with a given tenure in the previous job (so that
$\partial h_{2i}(t|T^*)/\partial\sigma_\omega^2 < 0$). However, the magnitude $|\partial h_{2i}(t|T^*)/\partial T^*|$ is
actually reduced for $T^*$ sufficiently large.

Finally, the magnitude $|\partial h_{2i}(t|T^*)/\partial T^*|$ depends on $p_i$, the rate at
which occupation-specific information arrives.

Proposition 4. $\partial^2 h_{2i}(t|T^*)/\partial T^* \partial p_i < 0$ for $t < 1/p_i$ and
$T^* < -1/\ln q_i$.

Proof. See Appendix A.

As $p_i$ increases, so does the likelihood that an occupational stayer
with a given tenure in the previous job has received
occupation-specific information.


III. Data

The data used for the empirical analysis of this paper were derived
from the National Longitudinal Survey's (NLS) youth cohort. This
panel data set follows 12,686 youths aged 14-22 at the time of the
1979 interview. At the time of this study, panel data for the survey
years 1979-85 are available.

The data set is derived from the NLS work history tape with some 
additional variables added from the youth cohort tapes. The tape 
builds a “job array” from the retrospective information included on 
the cohort tapes. This job array gives a week-by-week accounting of 
the respondent’s work history over the 1979—85 period. The work 
history tape, however, omits some variables that are essential for the 
analysis that follows. Specifically, no education or marital status vari¬ 
ables are included on this tape. In addition, it does not classify job 
separations by type. Hence, this information is merged to the work 
history tape from the youth cohort tapes. 

Since the goal of the empirical work is to test for differences in job 
separation behavior between those switching employers, but not occu¬ 
pations, and those switching both employers and occupations, only 
respondents who have worked at least two different jobs were
included in the data set. 4 The two jobs used for this purpose are the 
first two jobs worked after leaving school. Analysis of these jobs is 
attractive because it is likely that new entrants into the labor market 
face significant occupation-specific matching uncertainty. Although 
some occupation-specific information may be revealed during school¬ 
ing, jobs held concurrently with school attendance are not analyzed in 
this study. Many of these jobs involve summer employment, which 
necessarily terminates after 2 or 3 months. It is unlikely that jobs of 
this type are related to an individual’s ultimate career plans. 

The sample is restricted to those respondents who were full-time 
students as of the 1979 survey, entered the labor force, and worked 
for at least two separate employers. Those who held multiple jobs 
concurrently were omitted from the sample as were any respondents 
with missing values. The final sample size was 1,667. Table 1 sum¬ 
marizes the data. 

A one-digit occupational transition matrix is given in table 2. This 
shows the percentage of respondents in occupation i as of the end of 
the first job (job 1) who are in occupation j as of the start of the second 
job (job 2). Entries along the diagonal show the percentage who have 
remained in the same occupation when switching jobs. As can be seen 
from the table, there is considerable variation in this percentage 
across occupations (from a low of 38.4 percent in sales to a high of 69 


4 Since job information is updated yearly for those who do not switch, it is possible to
get information on occupation switching within a job. This information was not
exploited for two reasons. First, most jobs did not overlap two survey periods, and so
occupation status was observed only once. Second, it seems likely that switches of this
type occur primarily for "stepping-stone" reasons.

TABLE 1
Summary of Data

Whites (%)                                   58.1
Blacks (%)                                   26.3
Hispanics (%)                                15.6
Female (%)                                   52.3
High school graduates* (%)                   85.4
College graduates* (%)                       15.6
Average age at start of job 1 (yrs.)         19.9
Married at the start of job 2 (%)            20.6
Average tenure in job 1 (wks.)               38.2
Reasons for leaving job 1:
  Quit (%)                                   41.4
  Fired or laid off (%)                      23.1
  Other reasons (%)                          35.5
Occupation switchers job 1 to job 2 (%)      41.0
Unionized in job 2 (%)                       14.9
Government in job 2 (%)                       3.7

* High school graduates are defined as those who have completed 12
or more years of education by the year job 2 began. College graduates
are defined as those who have completed 16 or more years of education
by the year job 2 began.

percent in professionals). Overall, at the one-digit level, 41.0 percent 
switch occupations when switching jobs.

It may be of some interest to search for any "asymmetries" in the
transition matrix. This could give some rough indication of the
existence of "stepping-stone" occupations. One might expect that if
occupation $i$ serves as a training ground for occupation $j$, then a larger
percentage would be observed to move from $i$ to $j$ than from $j$ to $i$
(i.e., the ($i$, $j$) element would be larger than the ($j$, $i$) element). Only one
notable asymmetry presents itself in the data, between clerical and
p.l , p s p „TTr ly ?i pCTcen : <•> z t x, oi„ 

only aboui SpaJZZZl 

WhCre ° ne m,ght s “ s P e «Pri- 
asymmetries are present. rerS,dn °P eral,ves t0 craftsmen, no clear 


IV. Empirical Results 

[...] paper. If matching at the occupational level is significant, then
among occupation stayers, individuals with longer tenures in the
previous job should be significantly less likely to leave the current job.
Recall that $h_2(t|T^*)$ is the job separation hazard for individuals
working at their second job in an occupation who had tenure $T^*$ in
their first job. A first-order expansion of $\ln h_2(t|T^*)$ around $T^* = 0$
yields

$$\ln h_2(t|T^*) = \ln h_1(t) + \alpha_{T^*} T^* \tag{6}$$

where $\alpha_{T^*} = \partial \ln h_2(t|T^*)/\partial T^* |_{T^* = 0}$. Proposition 2 of Section II
implies that if occupation-specific information is significant, $\alpha_{T^*} < 0$.
Proposition 4 suggests that $|\alpha_{T^*}|$ increases with the arrival rate of
occupation-specific information.

Of course, differences in job separation behavior might arise from
other sources of population heterogeneity. It is assumed that this
heterogeneity affects $\ln h_2(t|T^*)$ additively. Thus for the $i$th
individual, $i = 1, \ldots, N$, we have

$$\ln h_2(t|T^*, x_i) = \ln h_1(t) + \alpha_{T^*} T^* + \sum_{j=1}^{k} \beta_j x_{ji} \tag{7}$$

or, equivalently,

$$h_2(t|T^*, x_i) = h_1(t) \exp\Bigl\{\alpha_{T^*} T^* + \sum_{j=1}^{k} \beta_j x_{ji}\Bigr\} \tag{8}$$

where $x_i$ is a $k$-vector of covariates with components $x_{ji}$, $j = 1, \ldots, k$.
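
The proportional form of the hazard, in which previous tenure and the covariates shift a baseline hazard multiplicatively, can be illustrated with a small Python sketch. The coefficient values, covariate values, and function names are hypothetical and are chosen only to show how the pieces combine.

```python
import math


def separation_hazard(baseline, alpha_t, tenure_prev, betas, covariates):
    """Baseline hazard scaled by exp(alpha_t * tenure_prev + beta'x)."""
    index = alpha_t * tenure_prev + sum(b * x for b, x in zip(betas, covariates))
    return baseline * math.exp(index)


def hazard_ratio(alpha_t, extra_tenure):
    """Proportional change in the hazard from `extra_tenure` additional
    weeks of previous tenure, holding covariates and baseline fixed."""
    return math.exp(alpha_t * extra_tenure)


# Hypothetical inputs: baseline hazard 0.04, alpha = -0.005 per week.
h_short = separation_hazard(0.04, -0.005, 10, [0.3, -0.2], [1.0, 2.5])
h_long = separation_hazard(0.04, -0.005, 30, [0.3, -0.2], [1.0, 2.5])
print(h_long / h_short, hazard_ratio(-0.005, 20))
```

Because the specification is log-linear, the ratio of the two hazards depends only on the difference in previous tenure and not on the baseline or the other covariates.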

Recall from Section III that job 1 and job 2 are defined as the first 
and second jobs (employers) worked after school is left, respectively. 
Let $\lambda_2$ denote the separation hazard for job 2 and define SWITCH as
an indicator variable that is one if the $i$th individual switches
occupations between job 1 and job 2 and zero otherwise. If we denote weeks
of tenure in job 1 for individual $i$ by TENURE1, then the results
above imply that

$$\lambda_2(t|\text{TENURE1}, \text{SWITCH}, x_i) = \lambda_2^0(t) \exp\Bigl\{\alpha \times (1 - \text{SWITCH}) \times \text{TENURE1} + \sum_{j=1}^{k} \beta_j x_{ji}\Bigr\} \tag{9}$$

where $\lambda_2^0(t)$ is the baseline hazard function. The model used for
estimation purposes, however, is

$$\lambda_2(t|\text{TENURE1}, \text{SWITCH}, x_i, \theta) = \theta \lambda_2^0(t) \times \exp\Bigl\{\alpha_1 \text{TENURE1} + \alpha_2 \text{SWITCH} \times \text{TENURE1} + \alpha_3 \text{SWITCH} + \sum_{j=1}^{k} \beta_j x_{ji}\Bigr\} \tag{10}$$

where $\theta$ ($E(\theta) = 1$) is an unobserved variable that may affect job
separation. This specification allows for other possible effects of
tenure in job 1 on the job 2 separation rate from which the simple
theoretical model of Section II abstracted. A weaker prediction of the
theory is that the effect of tenure in job 1 on the job 2 separation 
hazard is greater for occupation switchers ($\alpha_2 > 0$). The
specification of (10) also recognizes the possibility that some sources of
population heterogeneity that affect job separations may not be observed.

Before we turn to the estimates, consider the definition of an occu¬ 
pation switch used in the empirical estimation. An occupation switch 
is said to take place if the one-digit 1970 census occupation
classification listed at the start of job 2 differs from both one-digit classifications
listed at the start and end of job 1 (if they should differ). Given this 
definition, along with the fact that jobs 1 and 2 are the first two jobs 
worked after leaving school, it is unlikely that those who have 
switched occupations will have had any significant work experience in 
the occupation listed at the start of job 2. The end of this section 
discusses how the results change when an occupation switch is defined 
at the three-digit level. 

A list of the other covariates ($x_i$) used in the estimation of (10) is
presented in Appendix B. These include the starting hourly wage for 
job 2, the number of weeks an individual was out of work between 
jobs 1 and 2, one-digit occupation and industry dummy variables 
for job 2, and indicator variables for whether job 2 was a union or 
government job as well as for race, sex, schooling, and marital status. 

Four different maximum likelihood estimates of the parameters
of (10), corresponding to different assumptions about $\lambda_2^0(t)$ and $\theta$,
are presented in columns 1-4 of table 3. 5 The estimated baseline
hazard functions are graphed in figures 1 and 2. Columns 1 and 2
of table 3 assume that the baseline hazard function is a Weibull:
$\lambda_2^0(t) = \rho\phi(\rho t)^{\phi - 1}$. The estimates of column 1 do not control for
unobserved heterogeneity ($\theta = 1$), whereas the estimates of column 2
assume that $\theta$ has a gamma distribution. The estimates of columns 3
and 4 are obtained by semiparametric hazard estimation techniques 
designed for discrete data (see Prentice and Gloeckler 1978; Meyer 
1986). Unobserved heterogeneity is not controlled for in column 3 
and is accounted for by a gamma mixing distribution in column 4. 
Estimates using Cox’s partial likelihood method are presented in col¬ 
umn 5. The estimates in columns 3-5 do not constrain the baseline 

5 The likelihood function can be derived directly from the hazard function. For
example, suppose that all heterogeneity is accounted for by the covariates ($\theta = 1$).
Then the unconditional probability that the $i$th individual leaves job 2 at time $t_i$ is

$$\lambda_2(t_i|\text{TENURE1}, \text{SWITCH}, x_i) \exp\Bigl\{-\int_0^{t_i} \lambda_2(u|\text{TENURE1}, \text{SWITCH}, x_i)\, du\Bigr\}.$$



TABLE 3
Parameter Estimates for Job 2 Separation Hazard

                 (.2265)   (.4505)   (.2450)   (.3315)   (.2553)
Occupation:
  Managers       -.1744    -.3539    -.1464    -.2295    -.1468
                 (.2172)   (.3505)   (.2354)   (.2943)   (.2047)

Fig. 1.—Baseline hazard estimates (no mixing distributions). Series
shown: S.P./None, Wbl/None, Cox/None. Horizontal axis: tenure in job 2
(4-week intervals), 0 to 50.

Fig. 2.—Baseline hazard estimates (with gamma mixing distribution).
Series shown: S.P./Gamma, Wbl/Gamma. Horizontal axis: tenure in job 2
(4-week intervals), 0 to 50.

hazard to any particular parametric family. To reduce numerical 
computations, tenure in job 2 is grouped into 4-week intervals. 
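
To make the Weibull baseline and the grouping of tenure into 4-week intervals concrete, the following Python sketch evaluates the Weibull hazard $\rho\phi(\rho t)^{\phi-1}$, the implied survival probability, the exit density (hazard times survival), and the probability of separating within an interval conditional on surviving to its start. The parameter values and function names are hypothetical, and the sketch is not the estimation procedure used in the paper.

```python
import math


def weibull_hazard(t, rho, phi):
    """Weibull hazard rho * phi * (rho * t) ** (phi - 1), for t > 0."""
    return rho * phi * (rho * t) ** (phi - 1.0)


def weibull_cumulative_hazard(t, rho, phi):
    """Integral of the Weibull hazard from 0 to t."""
    return (rho * t) ** phi


def survival(t, rho, phi):
    """Probability of no separation before tenure t."""
    return math.exp(-weibull_cumulative_hazard(t, rho, phi))


def exit_density(t, rho, phi):
    """Likelihood contribution of a separation at t: hazard x survival."""
    return weibull_hazard(t, rho, phi) * survival(t, rho, phi)


def interval_hazard(start, width, rho, phi):
    """Probability of separating in [start, start + width), given
    survival to `start` (grouped-data version of the hazard)."""
    increment = (weibull_cumulative_hazard(start + width, rho, phi)
                 - weibull_cumulative_hazard(start, rho, phi))
    return 1.0 - math.exp(-increment)


RHO, PHI = 0.02, 1.3  # hypothetical Weibull parameters (time in weeks)
grouped = [interval_hazard(4 * k, 4, RHO, PHI) for k in range(5)]
print(grouped)
```

Working with interval hazards means that only the change in the cumulative hazard over each 4-week interval is needed, which is what makes grouped estimation less demanding numerically.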

The estimates of $\alpha_1$, $\alpha_2$, and $\alpha_3$ are presented in the first three rows
of table 3, respectively. Turning to the estimates that do not control
for unobserved heterogeneity in columns 1, 3, and 5, we see that
tenure in job 1 has a significantly negative impact on the job 2
separation hazard. For those who have switched occupations, however, the
magnitude of this effect is significantly smaller (at the 10 percent level
on the basis of a one-sided test). The value of $\alpha_3$ is not statistically
significant in any of the three specifications. So for a given tenure in job 1,
occupation switchers are more likely to leave job 2.

When unobserved heterogeneity is accounted for, in columns 2 and
4, the estimate of $\alpha_2$ increases and is significant at the 5 percent level.
The estimates of $\alpha_3$, however, are significantly negative. Together,
these results imply that individuals who remain in the same
occupation when switching jobs are less likely to leave job 2 only if tenure in
job 1 exceeds 1 year. This result is not surprising and could be due to
the fact that occupation-specific matching is a more complex process
than what was proposed in Section II.
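
The break-even tenure implied by a positive coefficient on SWITCH x TENURE1 together with a negative coefficient on SWITCH can be worked out directly. In the estimated specification, the log hazard of a switcher minus that of a stayer with the same job 1 tenure and covariates is the SWITCH coefficient plus the interaction coefficient times TENURE1. The Python sketch below uses hypothetical coefficient values chosen so that the break-even point is 52 weeks; they are not the estimates reported in table 3.

```python
def switcher_log_hazard_gap(alpha_2, alpha_3, tenure1):
    """Log hazard of a switcher minus log hazard of a stayer with the
    same job 1 tenure (in weeks) and the same covariates."""
    return alpha_3 + alpha_2 * tenure1


def break_even_tenure(alpha_2, alpha_3):
    """Job 1 tenure at which switchers and stayers have equal hazards."""
    if alpha_2 <= 0:
        raise ValueError("a break-even tenure requires alpha_2 > 0")
    return -alpha_3 / alpha_2


ALPHA_2 = 0.005  # hypothetical coefficient on SWITCH x TENURE1
ALPHA_3 = -0.26  # hypothetical coefficient on SWITCH
weeks = break_even_tenure(ALPHA_2, ALPHA_3)
print(weeks)
```

Below the break-even tenure the gap is negative, so switchers have the lower hazard; above it the gap is positive and stayers are the less likely group to leave job 2.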

Since the sample used for estimation purposes consists of only those 
individuals who had worked at least two different jobs in the time 
between leaving school and the 1985 interview date, selectivity bias 
may limit the usefulness of the estimates for analyzing the determi¬ 
nants of job separation. Nevertheless, a few observations merit men¬ 
tioning. First, the estimated baseline hazards portrayed in figures 1 
and 2 suggest that the Weibull distribution may be inappropriate for 
studying job separation. The semiparametric estimates suggest that
the baseline job separation hazard increases in the first 4 months of 


tenure and then tends to decrease thereafter. The Weibull baseline 
hazard, however, is restricted to be monotonic. Second, while the 
semiparametric baseline estimates do not change appreciably when 
unobserved heterogeneity is accounted for, the estimated baseline 
hazard changes dramatically when it is constrained to be Weibull. 
Finally, for all specifications, individuals with high initial hourly 
wages, as well as those who have switched jobs directly without any in¬ 
tervening out-of-work spell, are significantly less likely to leave job 2. 
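
The interaction between a monotonic Weibull baseline and a gamma mixing distribution can be illustrated numerically. The sketch below relies on a standard result for gamma heterogeneity with mean one and variance $v$: the hazard observed in the population equals the individual hazard divided by $1 + v\Lambda(t)$, where $\Lambda(t)$ is the cumulative baseline hazard. The parameter values and function names are hypothetical.

```python
def weibull_hazard(t, rho, phi):
    """Individual-level Weibull hazard, monotonic in t for any phi."""
    return rho * phi * (rho * t) ** (phi - 1.0)


def weibull_cumulative_hazard(t, rho, phi):
    """Integral of the Weibull hazard from 0 to t."""
    return (rho * t) ** phi


def observed_hazard(t, rho, phi, variance):
    """Population hazard when each individual's hazard is multiplied by
    a gamma-distributed factor with mean 1 and the given variance."""
    return weibull_hazard(t, rho, phi) / (
        1.0 + variance * weibull_cumulative_hazard(t, rho, phi)
    )


RHO, PHI, VARIANCE = 0.05, 1.5, 2.0  # hypothetical values
tenures = (2, 10, 60)
individual = [weibull_hazard(t, RHO, PHI) for t in tenures]
population = [observed_hazard(t, RHO, PHI, VARIANCE) for t in tenures]
print(individual)
print(population)
```

With these values the individual hazard rises throughout, while the population hazard first rises and then falls. A Weibull baseline can therefore track a hump-shaped observed hazard only through the mixing distribution, which may help explain why the Weibull estimates are sensitive to the treatment of unobserved heterogeneity.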

Equation (5) embodies the assumption that all covariates affect the
job 2 separation hazard proportionally. Some checks of this
specification using methods contained in McCall (1988a) suggest that the
proportional hazards assumption is reasonable, 7 although there is some
weak indication that the one-digit occupational dummies do not affect
$\lambda_2$ proportionally. 8


6 Heckman and Singer (1984a, 1984b) have shown that estimates can vary
considerably depending on the parametric family assumed for the distribution of $\theta$. Though
this result may be true when the baseline hazard is modeled as a Weibull, the
semiparametric estimates should be more robust to the specification of the distribution of $\theta$ since
some of the variability in the estimated effects of the covariates may be a result of
[...] baseline hazard (see Trussell and Richards 1985; Han and Hausman
1986). Nevertheless, a model with a Weibull baseline and two-mass-point mixing
[...] the robustness of the estimates. The estimates of the
$\alpha$'s obtained with this specification were similar to those obtained in col. 2 of table 3.

7 [illegible] were performed using partial likelihood methods only. 

8 The [illegible] were obtained for the services and clerical occupations. 
The estimate [illegible] was almost double that of clericals in magnitude. This 
[illegible] variance (if the conditions of proposition 3a are satisfied) 
of occupation-specific information is greater for services 
The empirical estimates of table 3 implicitly assume that all types of 
job separations are alike. Although this is true under certain circumstances 
(see, e.g., Jovanovic 1979b; Borjas and Rosen 1980), in general 
the model of Section II is a model of quits. To control for other types 
of job separations, a competing risks framework is used (see Elandt- 
Johnson and Johnson 1980; Kalbfleisch and Prentice 1980). The re¬ 
sults for job quitters are similar to those obtained in columns 2 and 4 
of table 3. However, the overall likelihood of quitting job 2 is greater 
for occupation switchers only among individuals with tenures of 2 
years or more on job 1. 

As a final check of robustness, estimates were obtained with occupa¬ 
tional switching defined at the 1970 census three-digit level. One 
might suspect more measurement error at the three-digit level, but 
there is no theoretical reason that the results should differ at this 
disaggregate level. Though not reported here, the estimates using the 
1970 census three-digit classifications lend additional support to the 
occupational matching hypothesis. When no adjustment is made for 
unobserved heterogeneity, the job 2 separation behavior of those who 
have switched occupations and those who have remained in the same 
occupation is not significantly different. However, significant differ¬ 
ences similar to those obtained at the one-digit level do emerge when 
unobserved heterogeneity is accounted for. Since “mixing” likelihood 
functions helps control for not only unobserved heterogeneity but 
measurement error, these results are consistent with the presumption 
of measurement error at the three-digit level. 


V. Summary and Conclusions 

This paper developed a theory of job matching in which matching 
information has both occupation-specific and job-specific compo¬ 
nents. If occupational matching is significant, then the likelihood of 
leaving the current job will decrease with increased tenure in the 
previous job for those who do not switch occupations between jobs. 

The National Longitudinal Survey’s youth cohort was used to test 
these predictions. Job separation rates of youths working at their 
second employer since leaving school were analyzed using a propor¬ 
tional hazards approach. Estimates were obtained using both para¬ 
metric and semiparametric hazard estimation techniques, when unob¬ 
served heterogeneity or measurement error was accounted for by 
mixing the likelihood function, and for occupation switches defined 
at both the one- and three-digit levels. In general, tenure in the previ¬ 
ous job had a significantly negative impact on separation rate from 
the current job. However, for those who had switched occupations 
between jobs, the magnitude of this effect was significantly less. Similar 
results were obtained when job quits were analyzed separately 
using a competing risks approach. 

Estimation of the underlying structural parameters of the model of 
Section II was not done in this paper (see, e.g., Miller 1984; Wolpin 
1987). This is left to future research. The empirical results of this 
paper suggest that a more complex occupation-specific matching process 
would be necessary for any structural estimation. This estimation 
might best proceed by employing simulation techniques (see, e.g., 
Pakes 1986). 

Finally, the results of this paper emphasize that, when one is model¬ 
ing the job matching process, independence assumptions must not 
ignore the data. Of course any dynamic analysis must ultimately in¬ 
voke independence. But whether this is done at an occupational or 
possibly industry or regional level, the data clearly indicate that at the 
job level the independence assumption is inappropriate. 


Appendix A 

This Appendix proves the propositions and theorem of the paper. 


Proof of Proposition 1 

a) By assumption, if $\xi_i \ge -d_i$, an intraoccupational job switch is not optimal 
when occupation-specific information arrives. So 

$$Z_i(\omega_i, \xi_i) = \frac{w_i + \xi_i + \omega_i}{1 - \beta}.$$

Set $Z_i(\omega_i, \xi_i) = Z^*$ and solve for $\omega_i$. Q.E.D. 

b) By assumption, irrespective of $\omega_i$, it is assumed that if an individual is 
poorly matched to a specific job ($\xi_i < -d_i$), then a job switch will occur. Let 
$V_i(\omega_i, Z)$ be the optimal value function of working at least one more period in 
the $i$th occupation when $\omega_i$ is known and a stopping option is available each 
period that pays $Z$. Whittle (1980) has shown under these circumstances that 
the index associated with the $i$th occupation is the value of $Z$ that makes one 
indifferent between continuing and stopping. Thus $Z_i(\omega_i)$ satisfies the recursive 
relation $Z_i(\omega_i) = V_i(\omega_i, Z_i(\omega_i))$. Writing out $V_i$, we have 

$$Z_i(\omega_i) = \omega_i + w_i + \beta r_i F_i(-d_i) Z_i(\omega_i) + \frac{\beta r_i \left\{ (w_i + \omega_i)[1 - F_i(-d_i)] + \int_{-d_i} \xi_i \, dF_i \right\}}{1 - \beta} + \beta (1 - r_i) Z_i(\omega_i).$$

Solving for $Z_i(\omega_i)$ and noting that an occupation switch will occur iff 
$Z_i(\omega_i) < Z^*$ proves the result. Q.E.D. 
Proof of Theorem 1 

Before I prove theorem 1, some additional notation will be introduced. Recall 
from the text that $h_{1i}(t)$ = Pr{separate from first job worked in occupation $i$ at 
tenure $t$ | no separation before $t$} and $h_{2i}(t|T^*)$ = Pr{separate from second job 
worked in occupation $i$ at tenure $t$ | no separation before $t$ and completed 
tenure $T^*$ for first job worked in occupation $i$}. Also, let $oi(t - 1)$ denote the 
event that occupation-specific information has been received by the start of 
the $t$th interval, $ji(t - 1)$ denote the event that job-specific information has 
been received by the start of the $t$th interval, Bad denote the event that the 
information (either job-specific or occupation-specific depending on the circumstances) 
was sufficiently bad to induce a quit, $oi(t)$ denote the event that a 
person receives occupation-specific information in period $t$, $ji(t)$ denote the 
event that a person has received job-specific information in period $t$, and $t$ indicate 
the event that the person has survived up to the beginning of period $t$. 

The proof follows from applying the law of iterated expectations and from 
noting that the probability of an event is just the expectation of the indicator 
function of that event. The events conditioned on at time t are whether the 
individual has not received job- or occupation-specific information by the 
start of period t, has received one but not the other, or has received both. 
From this we have 

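$$\begin{aligned}
h_{1i}(t) ={}& P[\sim oi(t-1) \cap \sim ji(t-1) \mid t] \\
& \times \{P[oi(t) \cap \sim ji(t) \cap \text{Bad} \mid t \cap \sim oi(t-1) \cap \sim ji(t-1)] \\
& \quad + P[ji(t) \cap \sim oi(t) \cap \text{Bad} \mid t \cap \sim oi(t-1) \cap \sim ji(t-1)] \\
& \quad + P[oi(t) \cap ji(t) \cap \text{Bad} \mid t \cap \sim oi(t-1) \cap \sim ji(t-1)]\} \\
& + P[\sim oi(t-1) \cap ji(t-1) \mid t] \times P[oi(t) \cap \text{Bad} \mid t \cap \sim oi(t-1) \cap ji(t-1)] \\
& + P[oi(t-1) \cap \sim ji(t-1) \mid t] \times P[ji(t) \cap \text{Bad} \mid t \cap oi(t-1) \cap \sim ji(t-1)],
\end{aligned} \tag{A1}$$

where a tilde is the negation sign. 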

Now for new entrants into an occupation, the probability of not receiving 
occupation-specific information by the start of the $t$th interval is $q_i^{t-1}$. Similarly, 
the probability of not receiving job-specific information by the start of 
the $t$th interval is $s_i^{t-1}$. When job-specific information arrives, either before 
occupation-specific information or after, then the probability of a quit is 
$F_i(-d_i)$. If occupation-specific information arrives before job-specific information, 
then a quit will occur with probability $G_i(\bar{\omega}_i)$. On the other hand, if a 
“good” job-specific match is obtained, then the (unconditional) probability 
that the arrival of occupation-specific information in period $t$ will induce a job 
quit is $\int_{-d_i} G_i(\omega_i(\xi_i)) \, dF_i / [1 - F_i(-d_i)]$. Substituting these facts into (A1) gives 
$P_i^*(t)$. 

To derive the expression for $h_{2i}(t|T^*)$, note that a person working at his 
or her second job in an occupation, who had tenure $T^*$ in the first job, will 
not have received occupation-specific information during the first job with 
probability $q_i^{T^*}$. By the memoryless property of the geometric distribution, 
the quit behavior of these individuals is identical to that of those first entering 
the occupation. The probability that an individual with tenure $T^*$ in 
the first job received favorable occupation-specific information during that 
job is $(1 - q_i^{T^*}) \times [1 - G_i(\bar{\omega}_i)]$. Individuals of this type will leave their second 
job only for job-specific reasons. The term $h_{2i}(t|T^*)$ is a weighted average 
of these two types of individuals. Q.E.D. 


Proof of Proposition 2 
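From theorem 1 we have 

$$h_{2i}(t|T^*) = z_i(T^*)[P_i^*(t) - Q_i^*(t)] + Q_i^*(t).$$

So 

$$\frac{\partial h_{2i}(t|T^*)}{\partial T^*} = \frac{\partial z_i(T^*)}{\partial T^*}[P_i^*(t) - Q_i^*(t)]. \tag{A2}$$

But $P_i^*(t) - Q_i^*(t)$ is positive. So $\partial h_{2i}(t|T^*)/\partial T^*$ and $\partial z_i(T^*)/\partial T^*$ have the same 
sign. Differentiating (5) with respect to $T^*$ gives 

$$\frac{\partial z_i(T^*)}{\partial T^*} = \frac{\ln q_i \, q_i^{T^*}[1 - G_i(\bar{\omega}_i)]}{\{[1 - G_i(\bar{\omega}_i)] + G_i(\bar{\omega}_i) q_i^{T^*}\}^2} < 0. \tag{A3}$$

Q.E.D. 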
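
The sign result in (A3) can be checked numerically. The sketch below assumes that the weight takes the form $z_i(T^*) = q_i^{T^*}/\{[1 - G_i] + G_i q_i^{T^*}\}$, which is what one obtains by normalizing the two probabilities given in the proof of theorem 1; this closed form is an inference rather than an expression quoted from the paper, and all parameter values (`q`, `G`, `P`, `Q`) are hypothetical.

```python
import math


def z_weight(T, q, G):
    """Share of job 1 stayers with tenure T who have not yet received
    occupation-specific information (assumed closed form, see lead-in)."""
    return q ** T / ((1.0 - G) + G * q ** T)


def dz_dT(T, q, G):
    """Derivative of the weight with respect to T, as in (A3)."""
    denom = (1.0 - G) + G * q ** T
    return math.log(q) * q ** T * (1.0 - G) / denom ** 2


def h2(T, P, Q, q, G):
    """Job 2 hazard as a weighted average of the two types."""
    return z_weight(T, q, G) * (P - Q) + Q


q, G = 0.9, 0.3      # hypothetical no-information and bad-news probabilities
P, Q = 0.12, 0.05    # hypothetical hazards, with P > Q

# Central finite difference of z at T = 10, to compare with the closed form.
step = 1e-5
numeric_slope = (z_weight(10 + step, q, G) - z_weight(10 - step, q, G)) / (2 * step)

# The job 2 hazard falls as tenure on job 1 rises.
hazards = [h2(T, P, Q, q, G) for T in (1, 5, 10, 20, 40)]
```

Because $\ln q_i < 0$, the slope is negative for every $T^*$, and the job 2 hazard moves from $P$ toward $Q$ as tenure in job 1 lengthens.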


Proof of Proposition 3 

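a) Differentiating (A2) with respect to $\sigma_i^2$ yields 

$$\frac{\partial^2 h_{2i}(t|T^*)}{\partial T^* \partial \sigma_i^2} = \frac{\partial^2 z_i(T^*)}{\partial T^* \partial \sigma_i^2}[P_i^*(t) - Q_i^*(t)] + \frac{\partial z_i(T^*)}{\partial T^*} \frac{\partial [P_i^*(t) - Q_i^*(t)]}{\partial \sigma_i^2}. \tag{A4}$$

Since $\bar{\omega}_i > Z^*(1 - \beta) - w_i - \alpha_i > 0$, 

$$\frac{\partial G_i[Z^*(1 - \beta) - w_i - \alpha_i]}{\partial \sigma_i^2} = \frac{\partial \Phi\{[Z^*(1 - \beta) - w_i - \alpha_i]/(\sigma_i^2)^{1/2}\}}{\partial \sigma_i^2} < 0$$

and 

$$\frac{\partial G_i(\bar{\omega}_i)}{\partial \sigma_i^2} < 0,$$

where $\Phi$ is the cumulative distribution for the standard normal. Define 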


A - */£,(»,) + J, 


Then 


(I - G,(», 

r ' r ! G,(os,(l,))dF, -I- - 1. _ 

1 - F.d.) 

nt) - m = ? /- ( i _ ?fM . 

- -+m 


(i,))dF, 


So 
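Given the assumptions of this example, $A$ reduces to 

$$A = s_i^{t} G_i(\bar{\omega}_i) + s_i^{t-1} r_i G_i[Z^*(1 - \beta) - w_i - \alpha_i] + (1 - s_i^{t-1}) G_i[Z^*(1 - \beta) - w_i - \alpha_i].$$

Hence, 

$$\frac{\partial A}{\partial \sigma_i^2} = s_i^{t} \frac{\partial \Phi[\bar{\omega}_i/(\sigma_i^2)^{1/2}]}{\partial \sigma_i^2} + s_i^{t-1} r_i \frac{\partial \Phi\{[Z^*(1 - \beta) - w_i - \alpha_i]/(\sigma_i^2)^{1/2}\}}{\partial \sigma_i^2} + (1 - s_i^{t-1}) \frac{\partial \Phi\{[Z^*(1 - \beta) - w_i - \alpha_i]/(\sigma_i^2)^{1/2}\}}{\partial \sigma_i^2} < 0.$$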


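Because $\partial^2 z_i(T^*)/\partial T^* \partial \sigma_i^2 = [\partial^2 z_i(T^*)/\partial T^* \partial G_i(\bar{\omega}_i)][\partial G_i(\bar{\omega}_i)/\partial \sigma_i^2]$, 

$$\operatorname{sign}\left[\frac{\partial^2 z_i(T^*)}{\partial T^* \partial \sigma_i^2}\right] = -\operatorname{sign}\left[\frac{\partial^2 z_i(T^*)}{\partial T^* \partial G_i(\bar{\omega}_i)}\right].$$

Differentiating (A3) with respect to $G_i(\bar{\omega}_i)$ gives 

$$\frac{\partial^2 z_i(T^*)}{\partial T^* \partial G_i(\bar{\omega}_i)} = \frac{\ln q_i \, q_i^{T^*}[1 - 2q_i^{T^*} + q_i^{T^*} G_i(\bar{\omega}_i)]}{\{[1 - G_i(\bar{\omega}_i)] + G_i(\bar{\omega}_i) q_i^{T^*}\}^3}.$$

This is negative when $1 - 2q_i^{T^*} + q_i^{T^*} G_i(\bar{\omega}_i) > 0$. These results, along with 
(A3) and the fact that $P_i^*(t) - Q_i^*(t) > 0$, establish that (A4) is negative if 
$1 - 2q_i^{T^*} + q_i^{T^*} G_i(\bar{\omega}_i) > 0$ or $T^* > \ln\{1/[2 - G_i(\bar{\omega}_i)]\}/\ln q_i$. Q.E.D. 

b) Since $Z^*(1 - \beta) - w_i - \alpha_i < \bar{\omega}_i < 0$, $\partial G_i[Z^*(1 - \beta) - w_i - \alpha_i]/\partial \sigma_i^2$ 
and $\partial G_i(\bar{\omega}_i)/\partial \sigma_i^2$ are both positive. Hence, $\partial [P_i^*(t) - Q_i^*(t)]/\partial \sigma_i^2$ is greater than 
zero. Substitute these results along with the fact that $\partial^2 z_i(T^*)/\partial T^* \partial G_i(\bar{\omega}_i)$ is 
negative when $1 - 2q_i^{T^*} + q_i^{T^*} G_i(\bar{\omega}_i) > 0$ into (A4). Q.E.D. 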


Proof of Proposition 4 

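It is somewhat simpler to derive $\partial^2 h_{2i}(t|T^*)/\partial T^* \partial q_i$. Since $p_i = 1 - q_i$, 

$$\frac{\partial^2 h_{2i}(t|T^*)}{\partial T^* \partial p_i} = -\frac{\partial^2 h_{2i}(t|T^*)}{\partial T^* \partial q_i}.$$

Differentiating $\partial h_{2i}(t|T^*)/\partial T^*$ with respect to $q_i$ gives 

$$\frac{\partial^2 h_{2i}(t|T^*)}{\partial T^* \partial q_i} = \frac{\partial^2 z_i(T^*)}{\partial T^* \partial q_i}[P_i^*(t) - Q_i^*(t)] + \frac{\partial z_i(T^*)}{\partial T^*} \frac{\partial [P_i^*(t) - Q_i^*(t)]}{\partial q_i}. \tag{A5}$$

Differentiating (A3) with respect to $q_i$ and simplifying yields 

$$\frac{\partial^2 z_i(T^*)}{\partial T^* \partial q_i} = [1 - G_i(\bar{\omega}_i)] q_i^{T^*-1} \frac{T^* \ln q_i \{[1 - G_i(\bar{\omega}_i)] - G_i(\bar{\omega}_i) q_i^{T^*}\} + \{[1 - G_i(\bar{\omega}_i)] + G_i(\bar{\omega}_i) q_i^{T^*}\}}{\{[1 - G_i(\bar{\omega}_i)] + G_i(\bar{\omega}_i) q_i^{T^*}\}^3}. \tag{A6}$$

Recall that 

$$P_i^*(t) - Q_i^*(t) = q_i^{t-1}(1 - q_i)A. \tag{A7}$$

Hence, 

$$\frac{\partial [P_i^*(t) - Q_i^*(t)]}{\partial q_i} = q_i^{t-2}[t(1 - q_i) - 1]A. \tag{A8}$$

Substituting (A3) and (A6)-(A8) into (A5) and simplifying gives 

$$\frac{\partial^2 h_{2i}(t|T^*)}{\partial T^* \partial q_i} = A q_i^{T^*+t-2}(1 - q_i)[1 - G_i(\bar{\omega}_i)] \left( T^* \ln q_i \{[1 - G_i(\bar{\omega}_i)] - G_i(\bar{\omega}_i) q_i^{T^*}\} + \{[1 - G_i(\bar{\omega}_i)] + G_i(\bar{\omega}_i) q_i^{T^*}\} + \frac{\ln q_i [t(1 - q_i) - 1]}{1 - q_i} \times \{[1 - G_i(\bar{\omega}_i)] + G_i(\bar{\omega}_i) q_i^{T^*}\} \right) \div \{[1 - G_i(\bar{\omega}_i)] + G_i(\bar{\omega}_i) q_i^{T^*}\}^3.$$

It is sufficient that $T^* < -1/\ln q_i$ and $t < 1/(1 - q_i) = 1/p_i$ for $\partial^2 h_{2i}(t|T^*)/\partial T^* \partial q_i > 0$. Q.E.D. 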
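
The algebra behind proposition 4 can be spot-checked by finite differences. The sketch below rests on several assumptions that should be read as such: the closed forms used for (A3) and (A6)-(A8) are reconstructions from a damaged source, $A$ is treated as a constant that does not depend on $q_i$, and the parameter values are hypothetical.

```python
import math


def dz_dT(T, q, G):
    """First derivative of the weight z with respect to T (assumed form of A3)."""
    denom = (1.0 - G) + G * q ** T
    return math.log(q) * q ** T * (1.0 - G) / denom ** 2


def d2z_dTdq(T, q, G):
    """Cross-partial of z with respect to T and q (assumed form of A6)."""
    denom = (1.0 - G) + G * q ** T
    bracket = T * math.log(q) * ((1.0 - G) - G * q ** T) + denom
    return (1.0 - G) * q ** (T - 1) * bracket / denom ** 3


def cross_partial_h2(T, t, q, G, A):
    """Cross-partial of the job 2 hazard, combining (A3) and (A6)-(A8) as in (A5)."""
    gap = q ** (t - 1) * (1.0 - q) * A                 # P - Q, assumed form of (A7)
    dgap = q ** (t - 2) * (t * (1.0 - q) - 1.0) * A    # its q-derivative, (A8)
    return d2z_dTdq(T, q, G) * gap + dz_dT(T, q, G) * dgap


q, G, A = 0.9, 0.3, 0.4   # hypothetical values
T, t = 5, 3               # satisfy T < -1/ln(q) and t < 1/(1 - q)

# Finite-difference check of (A6) against the derivative of (A3) in q.
step = 1e-6
numeric = (dz_dT(T, q + step, G) - dz_dT(T, q - step, G)) / (2 * step)
analytic = d2z_dTdq(T, q, G)

sign_under_sufficient_conditions = cross_partial_h2(T, t, q, G, A)
```

With these values both sufficient conditions hold ($T^* = 5 < -1/\ln 0.9 \approx 9.5$ and $t = 3 < 10$), and the cross-partial with respect to $q_i$ is positive, so the one with respect to $p_i$ is negative.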


Appendix B 


This appendix describes the variables used in the empirical analysis of Section IV. 

INITIALWAGE      Initial hourly wage for job 2 
FULLTIME         Dummy variable that equals one if hours per week worked in job 2 were greater than 30 
SWITCH           Dummy variable that equals one if the respondent switched occupations between job 1 and job 2 
TENURE1          Weeks of tenure in job 1 
HSCHGRD          Dummy variable that equals one if the respondent completed 12-15 years of schooling as of the start of job 2 
COLLEGEGRD       Dummy variable that equals one if the respondent completed 16 years (or more) of schooling as of the start of job 2 
WEEKSOUTWORK     Measures the number of weeks out of work between job 1 and job 2 
NOOUTWORKSPELL   Dummy variable that equals one if there were no weeks spent out of work between job 1 and job 2 
MARRIED          Dummy variable that equals one if the respondent was married as of the start of job 2 
NMARRIED         Dummy variable that equals one if the respondent was not married as of the start of job 2 
MALE             Dummy variable that equals one if the respondent was male 
FEMALE           Dummy variable that equals one if the respondent was female 
HISPANIC         Dummy variable that equals one if the respondent was Hispanic 
BLACK            Dummy variable that equals one if the respondent was black 
UNIONJOB2        Dummy variable that equals one if job 2 was unionized 
GOVTJOB2         Dummy variable that equals one if job 2 was a government job 

The following dummy variables are industry and occupation dummies that 
were set to one if the respondent’s classification number fell within the speci¬ 
fied range for job 2. The numbers represent the 1970 census industry and 
occupation classification codes. 9 


Industry                                          Classification 
Mining                                            047-057 
Construction                                      067-077 
Manufacturing                                     107-398 
Transportation/communications/public utilities    407-479 
Trade                                             507-698 
Finance/insurance/real estate                     707-718 
Business/repair services                          727-759 
Personal services                                 769-798 
Entertainment/recreation services                 807-809 
Professional services                             828-897 
Public administration                             907-937 

Occupation                                        Classification 
Managers                                          201-245 
Sales                                             260-284 
Clerical                                          301-395 
Craftsmen                                         401-575 
Operatives                                        601-715 
Laborers                                          740-765 
Farmers                                           801-824 
Services                                          901-965 
Household                                         980-984 
We study the behavior of a large trader with private information 
about the mean of an asset with a risky return. We argue that if the 
variability of the return is not too great, typically the trader will find 
it desirable to ensure that the market price does not reveal his infor¬ 
mation, that is, that a “pooling” equilibrium arises. Such an equilib¬ 
rium has the advantage of avoiding the incentive constraints that 
arise in "separating" equilibria, where information can be inferred 
from prices. Thus the efficient market hypothesis may well fail if 
there is imperfect competition. Despite the uninformativeness of 
prices, the other (competitive) traders are also better off in the pool¬ 
ing equilibrium than in any separating equilibrium, again if one 
assumes limited variability. 


I. Introduction 

The efficient market hypothesis holds that in markets with significant 
informational asymmetries (e.g., securities markets) equilibrium 
prices aggregate information effectively. Thus a trader can infer all 
he needs to know about others’ information simply from observing 
prices. 

Most theoretical examinations of this hypothesis (e.g., Grossman 
1976; Radner 1979; Allen 1981) have posited perfect competition, in 
which individual traders are too small to affect market prices. By and 
large this literature has confirmed the informational efficiency of 
markets: As long as there are “enough prices” (i.e., trade can be made 
contingent on sufficiently many events), competitive equilibrium is 
generically “separating” in the sense that the function relating infor¬ 
mation to prices is invertible. Imperfect competition adds a new com¬ 
plication to the efficient market question because, when some traders 
are large, the amount of information conveyed by prices is to some 
degree a matter of their strategic choice. In fact, we shall argue that, 
when traders have rational expectations, there are good reasons to 
believe that the efficient market hypothesis breaks down with imper¬ 
fect competition. A large trader will typically find it advantageous to 
conceal his private information parameter by ensuring that the equi¬ 
librium price is not sensitive to local variations in this parameter; that 
is, he will induce an equilibrium of a (locally, at least) “pooling” na¬ 
ture. 

Suppose that the large trader’s parameter $\theta$ is positively correlated 
with the mean value of a particular asset. At first sight, the bias in 
favor of a pooling equilibrium for this asset, one in which the same 
price emerges for both high and low values of $\theta$, may seem puzzling. 
If the large trader is a net buyer of the asset, a pooling equilibrium 
has the apparent advantage that he can buy the asset for less than in a 
separating equilibrium if $\theta$ is high (since in the pooling equilibrium 
the market price corresponds to an average of the high and low values 
of $\theta$). But by the same token, he must pay more than in the separating 
equilibrium when $\theta$ is low. Thus it might seem as though either a 
pooling or a separating equilibrium could be better depending on the 
particular utility functions, probabilities, and so forth. 

Actually, however, rational expectations require that, in pooling 
and separating equilibria, the expected price paid by the large trader 
be (approximately) the same. The reason is that if the sellers (small 
traders) are price takers, the market price simply equals their “marginal 
cost” of selling (i.e., their marginal disutility). Thus if $\theta$ takes on 
the values $\theta_1$ and $\theta_2$ with probabilities $\pi_1$ and $\pi_2$, the prices in a 
separating equilibrium are $p(\theta_1) = MC(\theta_1)$ and $p(\theta_2) = MC(\theta_2)$, and 
in a pooling equilibrium, the price is $p = \pi_1 MC(\theta_1) + \pi_2 MC(\theta_2)$. In 
either case, the expected price is $\pi_1 MC(\theta_1) + \pi_2 MC(\theta_2)$. (The qualification 
of approximate equality is made above because, in general, the 
marginal costs depend on the quantity sold, and, moreover, the equation 
$p = \pi_1 MC(\theta_1) + \pi_2 MC(\theta_2)$ implies risk neutrality on the part of 
the sellers. However, the approximation is good enough when $\theta_1$ and 
$\theta_2$ are close.) 
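
The equality of expected prices can be illustrated with a short calculation. The sketch below assumes constant marginal costs and risk-neutral sellers, as in the simplified argument above; the numerical values and the dictionary keys are hypothetical.

```python
def separating_prices(marginal_cost):
    """In a separating equilibrium each state is priced at its own marginal cost."""
    return dict(marginal_cost)


def pooling_price(marginal_cost, probs):
    """In a pooling equilibrium a single price equals the expected marginal cost."""
    return sum(probs[state] * marginal_cost[state] for state in marginal_cost)


def expected_price(prices, probs):
    """Probability-weighted average of state-contingent prices."""
    return sum(probs[state] * prices[state] for state in prices)


# Hypothetical marginal costs MC(theta_1) < MC(theta_2) and prior probabilities.
marginal_cost = {"theta_1": 1.0, "theta_2": 1.2}
probs = {"theta_1": 0.4, "theta_2": 0.6}

sep = separating_prices(marginal_cost)
pool = pooling_price(marginal_cost, probs)
expected_sep = expected_price(sep, probs)
```

The pooling price lies strictly between the two separating prices, yet the large trader pays the same amount on average in both cases, which is why the comparison between the two equilibria turns on quantities rather than prices.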

Thus the important difference between separating and pooling 
equilibria lies not with prices but rather with quantities. In a pooling 
equilibrium, the large trader can buy all he wants at the market price 
(given the presumed linearity of the small traders’ disutility). But this 
cannot be the case in a separating equilibrium because, in such an 
equilibrium, the price of the asset is low when the value of $\theta$ is low. If 
the large trader could buy all he wanted at such a price, he would see 
to it that the price was always low, violating the hypothesized separat¬ 
ing nature of equilibrium. Hence, the trader must be quantity- 
constrained at the low price. This incentive constraint implies that, on 
average, he cannot buy as much in a separating equilibrium as in a 
pooling equilibrium. Hence, he will favor the pooling equilibrium. 

This simple argument assumed linear preferences on the part of 
sellers. However, it is suggestive of the general argument, which we 
provide below in proposition 6. For similar reasons it follows that just 
as the large trader prefers pooling, so do the other traders (proposi¬ 
tion 9). 

There is quite a substantial empirical literature on large trader 
behavior in asset markets (see Niederhoffer and Osborne 1966; Lorie 
and Niederhoffer 1968; Kraus and Stoll 1972; Scholes 1972; Grier 
and Albin 1973; Jaffe 1974; Dann, Mayers, and Raab 1977; Baesel 
and Stein 1979), but relatively few theoretical investigations. One set 
of papers assumes that the large trader can commit to a strategy ex 
ante (i.e., before the informational parameters have been realized). 
The solution concept is therefore Stackelberg equilibrium. Gould and 
Verrecchia (1985) develop a static model with a single risk-neutral 
large trader with private information (a specialist) and many risk- 
averse small traders. The specialist sets a price, and the others choose 
the quantities they wish to trade at this price. Pricing strategies are 
restricted to be linear in information with additive noise. It is found 


that, when the small traders also possess private information, the 
specialist profits by garbling his information (taking the noise term to 
be nondegenerate). When only the specialist has private information, 
he is hurt by adding noise. Grinblatt and Ross (1985) examine a 
similar model that also features a set of irrational or “noise” traders. 


Like Gould and Verrecchia, they restrict their analysis to linear strate¬ 
gies and find that introducing noise is not optimal. Cripps (1986) 
demonstrates that this result may rely crucially on linearity. In a related 
model but with fewer restrictions on strategies, he finds that 
some pooling is desirable (see also Kihlstrom and Postlewaite 1983). 

One objection to these models is that it is difficult to see how the 
large trader can commit himself to a pricing strategy beforehand. To 
begin with, in many circumstances there is no “beforehand.” Often, a 
trader is in the market only because he has acquired private informa¬ 
tion. Before obtaining this information, he may not foresee his par¬ 
ticipation and so cannot contemplate what his strategy will be. But 
even if there is a well-defined ex ante period, the trader will generally 
wish, once he acquires his information, to charge a price different 
from the one prescribed by his Stackelberg strategy.1 

Another shortcoming of some of the papers is the restriction to 
linearity. The principal justification offered for looking only at linear 
strategies seems to be analytic convenience. 

Kyle (1985) overcomes the first drawback (although not the sec¬ 
ond). He constructs a dynamic model in which a large insider faces 
irrational players and risk-neutral market makers. Rather than as¬ 
sume that the insider can precommit to a strategy, he supposes that, at 
all dates, the insider’s behavior is optimal given the strategies and 
beliefs of the others; that is, he studies perfect Bayesian equilibrium 
(PBE). One interesting feature of the equilibrium examined is that 
the insider trades in such a way that his private information is incor¬ 
porated by prices only gradually. However, Kyle derives only the 
equilibrium in which traders use linear strategies (see also Altug 
1985). 

As we explain in Section IV, models such as Kyle’s and ours tend to 
exhibit many PBEs. Thus some restriction is necessary if any predic¬ 
tions are to be made. However, simply imposing linearity seems quite 
arbitrary. Our approach is to suppose that, by virtue of his market 
power, the large trader should be able to influence not only the mar¬ 
ket price but also traders’ beliefs. We maintain, therefore, that a natu¬ 
ral PBE on which to focus is the one most favored by the large trader. 
We study this equilibrium without imposing any a priori restriction on 
strategies. 

In Section II, we lay out a model of an asset market with a large 
“inside” trader and many small outsiders. The large trader is an in¬ 
sider by virtue of having private information about the mean of the 
asset’s return. In Section III, we define perfect Bayesian equilibrium 
and show that PBEs satisfy two monotonicity properties (proposition 
1). Section IV demonstrates that a completely separating equilibrium 
always exists in our model (proposition 2) and that a completely pooling 
equilibrium also exists (proposition 3) provided that the information 
parameter $\theta$ does not vary too much (proposition 4 shows what 
can go wrong with excessive variability). There is also a continuum of 
intermediate equilibria. However, proposition 5 establishes that the 
large trader’s favorite equilibrium will always involve complete pool¬ 
ing or complete separation. In Section V, we formalize the intuitive 


1 One possible justification for the Stackelberg equilibrium is concern for reputation, 
if the large trader transacts on this market repeatedly. But as we explain in Laffont and 
Maskin (1987), such a reputation effect relies on quite sophisticated behavior on the 
part of small traders. 
argument above and demonstrate in proposition 6 that pooling is 
optimal for the large trader provided that $\theta$ does not vary too much. 
(Drawing on propositions 4 and 5, proposition 7 establishes the optimality 
of separating equilibrium for a highly variable $\theta$.) Through 
Section V, we concentrate on the case in which $\theta$ reflects information 
about the asset's mean return. In Section VI, we briefly argue that our 
results carry through when $\theta$ measures the variability of the return 
(proposition 8). We examine the welfare of the small traders in Section 
VII, where we establish that they too prefer pooling to separating 
equilibria (proposition 9) when $\theta$ does not vary too much. Finally, we 
offer a few concluding comments in Section VIII. The Appendix 
provides the proofs of propositions 2-5 and 9. 


II. The Model 

We consider a two-period model with two assets: a safe asset (money) 
with (gross) return normalized to one and a risky asset with return 
$\theta + \epsilon$, where $\epsilon$ is a zero-mean random variable with cumulative distribution 
function $F$ and $\theta$ is a random variable, independent of $\epsilon$, that 
takes on the values $\theta_1$ and $\theta_2$ ($\theta_1 < \theta_2$) with probabilities $\pi_1$ and $\pi_2$. 
Assets are traded in period 1 and $\epsilon$ is realized in period 2. The 
variable $\theta$ is realized in the first period (before trade), but its value is 
not known to all traders. 

There is a continuum of identical risk-averse traders of Lebesgue 
measure one who initially own all the risky asset but do not know the 
realization of $\theta$ in the first period. In period 1, the endowment of the 
typical small trader comprises $w_0$ units of money and one unit of risky 
asset. His budget constraint in period 2 is therefore 

$$\alpha v + b = w_0 + v, \tag{1}$$

where money is the numeraire, $v$ is the price of the risky asset, $b$ is the 
final money holding, and $\alpha$ is the share of the risky assets that he 
keeps. His income in period 2 is therefore $b + \alpha(\theta + \epsilon)$ or, from (1), 

$$w_0 + v + \alpha(\theta + \epsilon - v).$$

If $u(\cdot)$ ($u' > 0$, $u'' < 0$) is the small trader’s von Neumann-Morgenstern 
utility function for second-period income, his expected 
utility in period 1 is 

$$\pi_1 E u(w_0 + v + \alpha(\theta_1 + \epsilon - v)) + \pi_2 E u(w_0 + v + \alpha(\theta_2 + \epsilon - v))$$

or, if we set $U(\theta, \alpha, v) = E u(w_0 + v + \alpha(\theta + \epsilon - v))$, 

$$\pi_1 U(\theta_1, \alpha, v) + \pi_2 U(\theta_2, \alpha, v).$$

We shall assume that $u$ exhibits nonincreasing absolute risk aversion. 
In our model, there is also a risk-neutral large trader who has a 
period 1 endowment of one unit of money and zero units of the risky 
asset. Unlike the small traders, he knows the realization of $\theta$. If the 
realization is $\theta$ and he buys a share $\beta$ of the risky asset, his expected 
payoff in period 1 is $1 + \beta(\theta - v)$. 

After learning $\theta$, the large trader chooses a quantity to buy. Price 
adjusts to equate supply and demand, that is, $\beta = 1 - \alpha$. Equivalently, 
we can assume (and this is the approach we take here) that he 
chooses a price $v$, and so the small traders determine the quantity 
exchanged. In period 2 the realizations of $\theta$ and $\epsilon$ become public 
knowledge. 

A large-trader strategy is a mapping $v: \{\theta_1, \theta_2\} \to R$ that prescribes a 
price $v(\theta)$ on the basis of the trader’s private information $\theta$. In principle, 
$v(\cdot)$ can be a random function. We show in proposition 5, however, 
that randomness is never optimal for the large trader. 

A small-trader strategy is a mapping $\alpha: R \to [0, 1]$ that represents the 
share of the risky asset retained for each price. Given the strict concavity 
of $u$, the optimal $\alpha$ given $v$ will always be unique. 

Conditional beliefs for the typical small trader are represented by a 
mapping that associates to each price $v$ a probability function $g(\cdot \mid v)$ on 
$\{\theta_1, \theta_2\}$, where $g(\theta \mid v)$ is the probability that the small trader attaches to 
the value $\theta$ given price $v$. 

III. Perfect Bayesian Equilibrium 

A perfect Bayesian equilibrium for our model is a pair of strategies 
$(v(\cdot), \alpha(\cdot))$ and a family of conditional beliefs $g(\cdot \mid \cdot)$ such that (i) for all $v$ in 
the range of $v(\cdot)$, $g(\cdot \mid v)$ is the conditional probability of $\theta$ obtained by 
updating the prior $(\pi_1, \pi_2)$, using $v(\cdot)$, in Bayesian fashion; (ii) for all 
$v$, $\alpha(v) \in \arg\max_{\alpha} \sum_i U(\theta_i, \alpha, v) g(\theta_i \mid v)$; and (iii) for all $\theta$, 
$v(\theta) \in \arg\max_{v} [1 - \alpha(v)](\theta - v)$. 

Condition i stipulates that small traders have rational expectations. 
Conditions ii and iii are simply the requirements that traders be optimizing. 
Condition iii implies that 

$$[1 - \alpha(v(\theta))][\theta - v(\theta)] \ge [1 - \alpha(v(\theta'))][\theta - v(\theta')] \quad \text{for any } \theta, \theta'. \tag{2}$$

Perfect Bayesian equilibria satisfy the following two standard monotonicity 
properties. 

Proposition 1. In any PBE, $\alpha(\cdot)$ is nonincreasing on the set of 
prices that are charged in equilibrium and $v(\cdot)$ is nondecreasing. 

Proof. Formula (2) can be rewritten as 

$$(1 - \alpha_1)(\theta_1 - v_1) \ge (1 - \alpha_2)(\theta_1 - v_2), \tag{3a}$$

$$(1 - \alpha_2)(\theta_2 - v_2) \ge (1 - \alpha_1)(\theta_2 - v_1), \tag{3b}$$

where $v_i$ is in the range of $v(\theta_i)$ and $\alpha_i = \alpha(v_i)$ for $i = 1, 2$. Adding (3a) 
and (3b) yields 

$$(\alpha_1 - \alpha_2)(\theta_2 - \theta_1) \ge 0.$$

Therefore, 

$$\alpha_1 \ge \alpha_2. \tag{4}$$

Now suppose, contrary to the proposition, that $v_1 > v_2$. In view 
of (4), 

$$(1 - \alpha_1)(\theta_1 - v_1) < (1 - \alpha_2)(\theta_1 - v_2),$$

contradicting (3a). Hence $v(\cdot)$ is nondecreasing and, from (4), $\alpha(\cdot)$ is 
nonincreasing, as required. Q.E.D. 

In common with other signaling models, there is, in general, a 
considerable multiplicity of equilibria in our framework. To help 
characterize these equilibria, we first define $\hat{\alpha}(v, \theta)$ to be the fraction 
of the asset retained by small traders if price is $v$ and they know that the 
realized value is $\theta$. Similarly, let $\alpha^{\circ}(v, \theta_1, \theta_2)$ be the fraction retained if the probability 
that $\theta = \theta_i$ is $\pi_i$. Note that $\hat{\alpha}$ and $\alpha^{\circ}$ are well defined because of the 
concavity of the small traders’ utility function. We first show that they 
are increasing in $\theta$. To obtain this result, we shall invoke our assumption 
of nonincreasing absolute risk aversion on the part of the small 
traders. Moreover, we shall assume an interior solution to the small 
traders’ problem. 

Lemma. $\hat{\alpha}_2(v, \theta) > 0$ and $\alpha^{\circ}_2(v, \theta_1, \theta_2) > 0$, where subscripts denote 
partial derivatives. 

Proof. $\hat{\alpha}(v, \theta)$ solves 

$$\max_{\alpha} E u(w_0 + v + \alpha(\theta + \epsilon - v)).$$

Hence, the first-order condition is 

$$E(\theta + \epsilon - v) u'(w_0 + v + \hat{\alpha}(v, \theta)(\theta + \epsilon - v)) = 0. \tag{5}$$

Differentiating (5) with respect to $\theta$, we obtain 

$$\hat{\alpha}_2 = \frac{-E[u' + \hat{\alpha}(\theta + \epsilon - v) u'']}{E(\theta + \epsilon - v)^2 u''}. \tag{6}$$

The denominator of the right-hand side of (6) is negative because $u'' < 0$. 
Because of nonincreasing absolute risk aversion, $E(\theta + \epsilon - v) u'' \ge 0$ 
(see Arrow 1971), and so the numerator is negative as well. 
Hence, with nonincreasing absolute risk aversion, $\hat{\alpha}_2(v, \theta) > 0$. The 
argument is similar for $\alpha^{\circ}_2(v, \theta_1, \theta_2)$. Q.E.D. 
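
The lemma can be illustrated with a worked example. The sketch below is not part of the paper: it assumes constant absolute risk aversion (CARA) utility, which is a boundary case of nonincreasing absolute risk aversion, and a symmetric two-point distribution for the return shock. The function names and all parameter values are hypothetical. The retained share is found by bisection on the first-order condition (5) and compared with the closed form available in this special case.

```python
import math


def first_order_condition(share, theta, v, w0, rho, shocks):
    """E[(theta + eps - v) * u'(income)] for CARA utility u(y) = -exp(-rho * y)."""
    total = 0.0
    for eps, prob in shocks:
        excess = theta + eps - v
        income = w0 + v + share * excess
        total += prob * excess * rho * math.exp(-rho * income)
    return total


def retained_share(theta, v, w0=1.0, rho=2.0, shocks=((-0.5, 0.5), (0.5, 0.5))):
    """Share of the risky asset a small trader keeps when theta is known.

    The first-order condition is decreasing in the share (u'' < 0), so
    bisection on [0, 1] finds the optimum; corner cases are returned as-is.
    """
    if first_order_condition(0.0, theta, v, w0, rho, shocks) <= 0:
        return 0.0
    if first_order_condition(1.0, theta, v, w0, rho, shocks) >= 0:
        return 1.0
    lo, hi = 0.0, 1.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if first_order_condition(mid, theta, v, w0, rho, shocks) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


price = 1.0
share_low = retained_share(theta=1.1, v=price)
share_high = retained_share(theta=1.2, v=price)

# Closed form for CARA utility with shocks of +/- s and equal probability:
# share = ln[(theta - v + s) / (s - theta + v)] / (2 * rho * s).
closed_form_low = math.log((1.1 - price + 0.5) / (0.5 - 1.1 + price)) / (2 * 2.0 * 0.5)
```

At a given price, the trader who knows that the mean return is higher keeps a larger share, which is the monotonicity the lemma asserts.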



EFFICIENT MARKET HYPOTHESIS 




IV. Separating and Pooling Equilibria 

Using the lemma, we shall show that there always exists a completely
separating equilibrium in our model. First define $v^*(\theta)$ to be the
solution of the program

$\max_{v} \; (\theta - v)[1 - \hat\alpha(v, \theta)]$. (7)

For convenience, we shall assume throughout that the solution to (7)
is unique for all $\theta$, so that $v^*(\theta)$ is well defined. Take

$\alpha^*(\theta) = \hat\alpha(v^*(\theta), \theta)$. (8)

The pair $(v^*(\theta), \alpha^*(\theta))$ are the equilibrium price and quantity that
would obtain if the small traders had perfect information, that is, if
they knew that $\tilde\theta = \theta$. Notice that

$-[\theta - v^*(\theta)]\,\hat\alpha_1(v^*(\theta), \theta) - [1 - \hat\alpha(v^*(\theta), \theta)] = 0$,

and so

$\hat\alpha_1(v^*(\theta), \theta) < 0$.

Proposition 2. There exists a PBE $(v(\cdot), \alpha(\cdot), g(\cdot|\cdot))$ in which $v(\theta_1) < v(\theta_2)$.
Moreover, in this equilibrium, $(v(\theta_2), \alpha(\theta_2)) = (v^*(\theta_2), \alpha^*(\theta_2))$,
and $v(\cdot)$ is a solution to the following program:

$\max \; \sum_i w_i(\theta_i - v_i)[1 - \hat\alpha(v_i, \theta_i)]$ (9)

subject to

$(\theta_2 - v_2)[1 - \hat\alpha(v_2, \theta_2)] \ge (\theta_2 - v_1)[1 - \hat\alpha(v_1, \theta_1)]$ (10)

for any positive weights $w_1$ and $w_2$.

We relegate the formal proof of proposition 2 to the Appendix, but
we can readily describe the construction of equilibrium. We take $v(\theta_2) = v^*(\theta_2)$
and let $v(\theta_1)$ be a $v_1$ that solves (7) (with $\theta = \theta_1$) subject to (10). Because
these choices of $v(\theta_1)$ and $v(\theta_2)$ are independent of $w_1$ and $w_2$, they
solve program (9)-(10) for any choice of weights. To complete the
description of equilibrium, we take $\alpha(v_i) = \hat\alpha(v_i, \theta_i)$, for $i = 1, 2$;
$\alpha(v) = \hat\alpha(v, \theta_2)$, for $v \notin \{v_1, v_2\}$; $g(\theta_i|v_i) = 1$, for $i = 1, 2$; and
$g(\theta_2|v) = 1$, for $v \notin \{v_1, v_2\}$.

Notice that when $\tilde\theta = \theta_2$, the large trader is unconstrained by the
informational imperfection: the price and allocation are the same as
though the small traders could directly observe $\tilde\theta$. When $\tilde\theta = \theta_1$,
however, $v$ and $\alpha$ are restricted by the incentive constraint (10).
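The construction just described can be made concrete in a linear special case. The sketch below is an illustration under an assumption the paper does not make: CARA utility with coefficient $a$ and normally distributed $\epsilon$ with variance $\sigma^2$, so that the interior demand is $\hat\alpha(v, \theta) = (\theta - v)/k$ with $k = a\sigma^2$. Then $v^*(\theta) = \theta - k/2$, and the binding version of (10) is a quadratic in $d_1 = \theta_1 - v_1$ whose larger root gives the separating price for $\theta_1$. Function names are hypothetical.

```python
import math

def alpha_hat(v, theta, k=1.0):
    """Fraction retained by small traders who believe the mean is theta.

    Assumption: linear demand (theta - v)/k, clipped to [0, 1].
    """
    return min(1.0, max(0.0, (theta - v) / k))

def payoff(theta, v, belief_theta, k=1.0):
    """Large trader's payoff (theta - v)(1 - alpha) given the small traders' belief."""
    return (theta - v) * (1.0 - alpha_hat(v, belief_theta, k))

def v_star(theta, k=1.0):
    """Full-information price: d(1 - d/k) with d = theta - v is maximal at d = k/2."""
    return theta - k / 2.0

def separating_prices(theta1, theta2, k=1.0):
    """Prices (v(theta1), v(theta2)) with constraint (10) holding with equality.

    Solves (delta + d1)(1 - d1/k) = k/4 for d1 = theta1 - v1, taking the
    larger root, which is the feasible price closest to v_star(theta1).
    """
    delta = theta2 - theta1
    d1 = 0.5 * ((k - delta) + math.sqrt(delta * delta + 2.0 * k * delta))
    return theta1 - d1, v_star(theta2, k)

if __name__ == "__main__":
    v1, v2 = separating_prices(1.9, 2.0)
    print("v(theta1) =", v1, " v*(theta1) =", v_star(1.9), " v(theta2) =", v2)
    print("payoff when theta2, mimicking theta1:", payoff(2.0, v1, 1.9))
    print("payoff when theta2, own price:       ", payoff(2.0, v2, 2.0))
```

In this example the price for $\theta_1$ is pushed below its full-information level, while the type-$\theta_2$ trader is left exactly indifferent between the two prices, as the incentive constraint requires.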

The beliefs $g(\cdot|v)$ that we invoke in the proof of proposition 2 are
extreme and discontinuous: for $v = v(\theta_1)$ the small traders believe
that $\tilde\theta = \theta_1$, but for an only slightly different price they believe that
$\tilde\theta = \theta_2$. Using the methods of Laffont and Maskin (1987), however, we
could devise alternative beliefs that are continuous and monotonic
in $v$.

Fig. 1. (Horizontal axis: $\alpha$, with marks at 0, $\alpha(\theta_2) = \alpha^*(\theta_2)$, $\alpha(\theta_1)$, and 1.)

Figure 1 illustrates the separating equilibrium of proposition 2 in
$\alpha$-$I$ space, where $I$ is the amount of a small trader's wealth invested in
the safe asset: $I = w_0 + v(1 - \alpha)$.² The curve $O(\theta)$ is a small trader's
offer curve when he believes that $\tilde\theta = \theta$. It is obtained by maximizing
expected utility $Eu(I + \alpha(\theta + \epsilon))$ subject to the budget constraint
$I + v(\theta)\alpha = w_0 + v(\theta)$. The large trader maximizes his linear objective
function $\theta(1 - \alpha) - I$ (whose indifference curves have slope $-\theta$ if
$\tilde\theta = \theta$) over the offer curve (subject to the incentive constraint [10]).
Denote the solution by $\mathcal{E}(\theta)$. Then $\mathcal{E}(\theta)$ is the equilibrium allocation
when $\tilde\theta = \theta$. As we noted above, $\mathcal{E}(\theta_2)$ is constrained only by the offer
curve $O(\theta_2)$, whereas $\mathcal{E}(\theta_1)$ not only must lie on $O(\theta_1)$ but must satisfy the
restriction that if $\tilde\theta = \theta_2$, the large trader does not prefer $\mathcal{E}(\theta_1)$ to
$\mathcal{E}(\theta_2)$.

2 We thank a referee for suggesting this graphical illustration.

We next demonstrate that, at the opposite end of the spectrum, a
pooling equilibrium always exists when $\theta_1$ is not too far from $\theta_2$.

Proposition 3. For $\theta_1$ sufficiently near $\theta_2$, there exists a PBE $(v(\cdot), \alpha(\cdot), g(\cdot|\cdot))$
in which $v(\theta_1) = v(\theta_2) = v^\circ(\theta_1, \theta_2)$, where $v = v^\circ(\theta_1, \theta_2)$
solves the program

$\max_{v} \; \sum_i \pi_i(\theta_i - v)[1 - \alpha^\circ(v, \theta_1, \theta_2)]$. (11)

Proof. See the Appendix.

The pooling equilibrium of proposition 3 is illustrated in figure 2.
The small traders' offer curve $O(E\tilde\theta)$ is obtained by maximizing
$\pi_1 Eu(I + \alpha(\theta_1 + \epsilon)) + \pi_2 Eu(I + \alpha(\theta_2 + \epsilon))$ subject to the budget
constraint $I + v(\theta)\alpha = w_0 + v(\theta)$. The equilibrium allocation $\mathcal{E}(E\tilde\theta)$ is
obtained by choosing the allocation on $O(E\tilde\theta)$ that maximizes the large
trader's expected utility (his indifference curves have slope $-E\tilde\theta$).



The requirement in proposition 3 that $\theta_1$ and $\theta_2$ not be too far apart
is needed to ensure that the large trader has no incentive to set
$v \ne v^\circ(\theta_1, \theta_2)$. To see that this constraint can be a problem, we next note
that a pooling equilibrium may fail to exist if $\theta_2$ is too big relative to $\theta_1$.

Proposition 4. Suppose that $\lim_{x \to \infty} x u'(x) = \infty$ and $\alpha^*(\theta) < 1$ for all
$\theta$. Given $\theta_1$, there exists $\bar\theta_2$ such that there is no pooling equilibrium
for $\theta_2 > \bar\theta_2$.

If the large trader is willing to buy when $\tilde\theta = \theta_1$, then an
equilibrium pooling price $v$ must satisfy $v \le \theta_1$. But given such a low price,
the small traders will refuse to sell if $\theta_2$ is big enough. Thus for such
$\theta_2$, the large trader buys nothing in a pooling equilibrium. But
because $\alpha^*(\theta_2) < 1$, the large trader can obtain a positive payoff by
setting price $v^*(\theta_2)$, a contradiction. The formal proof can be found in
the Appendix.


V. The Large Trader’s Favorite Equilibrium 

When a pooling equilibrium exists, there is also a continuum of
"semiseparating" equilibria (where either $v(\theta_1)$ or $v(\theta_2)$ is a random
variable) intermediate between complete pooling and complete
separating (there are also pooling and separating equilibria other
than the ones we constructed in propositions 2 and 3). We argued in
the Introduction, however, that by virtue of his market power, the
large trader ought to be able to influence other traders' beliefs and so
ensure a favorable equilibrium for himself.

The simplest way that he can accomplish this is, before learning the
value of $\tilde\theta$, to make a public announcement that he will play his
favorite equilibrium. If the small traders believe him, they can do no better
than play their corresponding equilibrium strategies. Moreover, it
makes sense for them to believe him since, if they do, he has no
incentive to deviate.

Of course, such a public pronouncement may be implausible or
impossible. In Laffont and Maskin (1987), we consider another way of
formalizing the large trader's influence over beliefs. Suppose that the
market is repeated many times (where $\tilde\theta$ is drawn independently each
time), that small traders have prior beliefs about the statistical
relationship between $\tilde\theta$ and $v$, and that they update these beliefs on the
basis of their previous experience (under the hypothesis that the joint
distribution between $\tilde\theta$ and $v$ is stationary). Suppose that the large
trader's discount rate is near zero (which is equivalent to supposing
that he transacts frequently). If he chooses prices over time to
maximize his expected discounted sum of payoffs and if, as a result,
buyers' beliefs converge over time, behavior ultimately closely
approximates that in the PBE (of the one-shot model) that maximizes
the large trader's ex ante expected payoff $\sum_i \pi_i(\theta_i - v_i)(1 - \alpha_i)$. (In
Laffont and Maskin [1987], we also examine a more game-theoretic
foundation for this equilibrium.)

A quite different justification for focusing on the large trader's
favorite PBE derives from the welfare analysis in Section VII below. It
turns out that provided that $\tilde\theta$ does not vary too much, the pooling
equilibrium of proposition 3 not only is the large trader's favorite
PBE, but is preferred by the small traders to any separating
equilibrium. Thus Pareto dominance favors pooling.

Henceforth, we shall simply assume that the large trader is able to 
attain his ex ante best PBE (BPBE) and shall study the properties of 
this equilibrium. We first observe that we may as well assume that the 
BPBE is deterministic (i.e., either a pure separating or a pure pooling 
equilibrium). 

Proposition 5. There exists a BPBE $(v(\cdot), \alpha(\cdot), g(\cdot|\cdot))$ in which $v(\cdot)$ is
deterministic.

If $(v(\cdot), \alpha(\cdot), g(\cdot|\cdot))$ is a BPBE in which $v(\theta_1)$ and $v(\theta_2)$ are random,
choose $v_i \in$ support of $v(\theta_i)$, $i = 1, 2$, so that $v_1 <$ max{support of
$v(\theta_1)$} $\le$ min{support of $v(\theta_2)$} $< v_2$. Then we can construct an
alternative PBE $(\tilde v(\cdot), \tilde\alpha(\cdot), \tilde g(\cdot|\cdot))$ in which $\tilde v(\theta_i) = v_i$, $i = 1, 2$. (The large
trader's incentive constraints will be satisfied since they hold for the
original PBE.) Clearly, the large trader's payoff, for each value of $\theta_i$, is
the same as before, and so the new PBE is a deterministic BPBE. For
the details and the other cases (where one of $v(\theta_1)$ and $v(\theta_2)$ is not
random), see the Appendix.

We next turn to our main result: the observation that the large
trader profits from concealing his information if $\theta_1$ and $\theta_2$ are not too
far apart.

Proposition 6. If $\theta_1$ and $\theta_2$ are sufficiently close, the BPBE is the
pooling equilibrium that solves program (11).

Proof. From proposition 3, program (11) defines a pooling
equilibrium that exists for $\theta_1$ near enough $\theta_2$, and, by virtue of proposition 5,
we need compare it only with the large trader's favorite separating
equilibrium. From proposition 2, this latter equilibrium solves the
program

$\max_{(v_1, v_2)} \; \sum_i \pi_i(\theta_i - v_i)[1 - \hat\alpha(v_i, \theta_i)]$ (12)

subject to

$(\theta_2 - v_2)[1 - \hat\alpha(v_2, \theta_2)] \ge (\theta_2 - v_1)[1 - \hat\alpha(v_1, \theta_1)]$. (13)

Clearly, the solution $(v(\theta_1), v(\theta_2))$ to the program (12)-(13) satisfies
$v(\theta_2) = v^*(\theta_2)$. We first show that, for $\theta_1$ near $\theta_2$ (but $\theta_1 \ne \theta_2$),
$v(\theta_1) \ne v^*(\theta_1)$. Applying the envelope theorem, we have

$\dfrac{d}{d\theta_1}\,[\theta_2 - v^*(\theta_1)][1 - \hat\alpha(v^*(\theta_1), \theta_1)]\Big|_{\theta_1 = \theta_2} = -[\theta_1 - v^*(\theta_1)]\,\hat\alpha_2(v^*(\theta_1), \theta_1)$,

which is negative because $\hat\alpha_2 > 0$. Thus if $v(\theta_1) = v^*(\theta_1)$, (13) is
contradicted. Thus $v(\theta_1) \ne v^*(\theta_1)$, as claimed.

Now for $\theta_1$ near $\theta_2$, $v_1 = v^*(\theta_1)$ violates (13) but $v(\theta_1)$ is near $v^*(\theta_1)$.
Hence, $v(\theta_1)$ must satisfy (13) with equality. But (13) is violated for all
$v_1$ between $v^*(\theta_1)$ and $v^*(\theta_2)$, and, from proposition 1, $v(\theta_1) < v^*(\theta_2)$.
Hence $v(\theta_1)$ is the largest price less than $v^*(\theta_1)$ such that (13) holds
with equality.

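The derivative of the large trader's payoff in the separating
equilibrium with respect to $\theta_1$ is

$\dfrac{d}{d\theta_1} \sum_i \pi_i[\theta_i - v(\theta_i)][1 - \hat\alpha(v(\theta_i), \theta_i)] = \pi_1 \dfrac{d}{d\theta_1}\,[\theta_1 - v(\theta_1)][1 - \hat\alpha(v(\theta_1), \theta_1)]$. (14)

The right-hand side of (14) can be rewritten as

$\pi_1[1 - \hat\alpha(v(\theta_1), \theta_1)] - \pi_1\left\{[\theta_1 - v(\theta_1)]\dfrac{d\hat\alpha}{d\theta_1} + (1 - \hat\alpha)\dfrac{dv}{d\theta_1}\right\}$. (15)

Because (13) is binding,

$\dfrac{d}{d\theta_1}\,[\theta_2 - v(\theta_1)][1 - \hat\alpha(v(\theta_1), \theta_1)] = 0$. (16)

Hence

$[\theta_2 - v(\theta_1)]\dfrac{d\hat\alpha}{d\theta_1} + (1 - \hat\alpha)\dfrac{dv}{d\theta_1} = 0$. (17)

But at $\theta_1 = \theta_2$, the left-hand side of (17) and the expression in braces
in (15) are the same. Therefore, from (14), (15), and (17),

$\dfrac{d}{d\theta_1} \sum_i \pi_i[\theta_i - v(\theta_i)][1 - \hat\alpha(v(\theta_i), \theta_i)] = \pi_1[1 - \hat\alpha(v(\theta_1), \theta_1)]$. (18)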

Applying the envelope theorem to the solution $v^\circ(\theta_1, \theta_2)$ to
program (11), we find that the derivative of the large trader's payoff with
respect to $\theta_1$ is

$\dfrac{d}{d\theta_1} \sum_i \pi_i[\theta_i - v^\circ(\theta_1, \theta_2)][1 - \alpha^\circ(v^\circ(\theta_1, \theta_2), \theta_1, \theta_2)]$
$\quad = \pi_1[1 - \alpha^\circ(v^\circ(\theta_1, \theta_2), \theta_1, \theta_2)] - \sum_i \pi_i[\theta_i - v^\circ(\theta_1, \theta_2)]\,\alpha^\circ_2(v^\circ(\theta_1, \theta_2), \theta_1, \theta_2)$. (19)

But because $\alpha^\circ_2 > 0$, the right-hand side of (19) is less than that of (18).
Hence, because the pooling and separating equilibria generate the
same payoffs when $\theta_1 = \theta_2$, the former yields the large trader the
higher payoff for $\theta_1$ near (but strictly less than) $\theta_2$. Q.E.D.

We provided a rough heuristic explanation for proposition 6 in the
Introduction. Let us give a somewhat more rigorous but still intuitive
account to supplement the formal proof. To see that the "downward"
incentive constraint (13) must be binding in a separating equilibrium
when $\theta_1$ is near $\theta_2$, suppose that it is not. Then for $i = 1, 2$, $v(\theta_i) = v^*(\theta_i)$;
that is, equilibrium prices are the same as they would be if the
small traders knew the realization of $\tilde\theta$. Now if $\tilde\theta = \theta_2$, the effect on the
large trader's payoff of setting $v = v^*(\theta_1)$ rather than $v = v^*(\theta_2)$ is
zero to the first order because the latter price maximizes
$(\theta_2 - v)[1 - \hat\alpha(v, \theta_2)]$ and $v^*(\theta_1)$ is near $v^*(\theta_2)$. However, there is also an indirect
effect that arises because $\hat\alpha(v, \theta_1) < \hat\alpha(v, \theta_2)$: by setting $v^*(\theta_1)$, the large
trader makes the others believe that $\tilde\theta = \theta_1$ and so lowers the
proportion of the risky asset that they retain. Combining these two effects,
we see that the large trader is better off setting $v = v^*(\theta_1)$ when
$\tilde\theta = \theta_2$, an impossibility. We thus conclude that (13) must be binding.

Now in the separating equilibrium $(v(\cdot), \alpha(\cdot), g(\cdot|\cdot))$, consider what
happens to the large trader's expected payoff as the value $\theta_1$
increases. There are three effects: a direct increase at rate
$\pi_1[1 - \hat\alpha(v(\theta_1), \theta_1)]$, an indirect decrease due to $-\hat\alpha(v, \theta_1)$ falling (at rate
$-\hat\alpha_2(v, \theta_1)$), and another indirect increase due to $v(\theta_1)$ adjusting to
keep (13) an equality: the closer $\theta_1$ is to $\theta_2$, the less $v(\theta_1)$ has to deviate
from $v^*(\theta_1)$ in order to maintain (13), and so the greater is
$(\theta_1 - v_1)(1 - \alpha)$. In the pooling equilibrium, by contrast, the first two effects
persist, but the third is not present since there is no incentive
constraint corresponding to (13). Hence the rate of increase with $\theta_1$ of the
large trader's payoff is slower in the pooling than in a separating
equilibrium. Since payoffs from the two equilibria are the same when
$\theta_1 = \theta_2$, therefore, the pooling payoff must be higher when $\theta_1 < \theta_2$.
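The payoff comparison behind proposition 6 can be illustrated numerically. The sketch below is a hedged example, not the paper's method: it assumes CARA utility with coefficient `a` and normal noise with standard deviation `sigma`, so that under full information $\hat\alpha = (\theta - v)/(a\sigma^2)$, and under pooled beliefs $\alpha^\circ$ solves $\sum_i \pi_i e^{-a\alpha(\theta_i - v)}(\theta_i - v - a\sigma^2\alpha) = 0$. It compares expected payoffs only; it does not check off-equilibrium deviations, so it does not by itself establish that the pooling equilibrium exists. All function names are hypothetical.

```python
import math

def separating_payoff(theta1, theta2, pi1, a=1.0, sigma=1.0):
    """Ex ante payoff in the separating equilibrium with (13) binding.

    Assumption: linear demand alpha_hat = (theta - v)/k, k = a*sigma^2.
    """
    k = a * sigma ** 2
    delta = theta2 - theta1
    d1 = 0.5 * ((k - delta) + math.sqrt(delta * delta + 2.0 * k * delta))
    return pi1 * d1 * (1.0 - d1 / k) + (1.0 - pi1) * k / 4.0

def alpha_pool(v, theta1, theta2, pi1, a=1.0, sigma=1.0):
    """Fraction retained when small traders hold the prior beliefs (pi1, 1 - pi1)."""
    def foc(alpha):
        total = 0.0
        for p, th in ((pi1, theta1), (1.0 - pi1, theta2)):
            m = th - v
            total += p * math.exp(-a * alpha * m) * (m - a * sigma ** 2 * alpha)
        return total
    if foc(0.0) <= 0.0:      # corner: small traders sell everything
        return 0.0
    if foc(1.0) >= 0.0:      # corner: small traders sell nothing
        return 1.0
    lo, hi = 0.0, 1.0
    for _ in range(60):      # bisection on the first-order condition
        mid = 0.5 * (lo + hi)
        if foc(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def pooling_payoff(theta1, theta2, pi1, a=1.0, sigma=1.0, n=2000):
    """Value of program (11), approximated by a grid search over the price v."""
    k = a * sigma ** 2
    lo, hi = theta1 - k, theta2
    mean_theta = pi1 * theta1 + (1.0 - pi1) * theta2
    best = -math.inf
    for j in range(n + 1):
        v = lo + (hi - lo) * j / n
        alpha = alpha_pool(v, theta1, theta2, pi1, a, sigma)
        best = max(best, (mean_theta - v) * (1.0 - alpha))
    return best

if __name__ == "__main__":
    print("separating:", separating_payoff(1.9, 2.0, 0.5))
    print("pooling:   ", pooling_payoff(1.9, 2.0, 0.5))
```

With $\theta_1$ close to $\theta_2$, the pooling payoff exceeds the separating payoff in this parameterization, and the two coincide when $\theta_1 = \theta_2$, in line with the argument above.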

The pooling and separating equilibria are compared in figure 3. In 
the figure, the large trader’s expected utility from the separating 






equilibrium corresponds to $\mathcal{E}(\theta_1)$ because, whether $\tilde\theta = \theta_1$ or $\tilde\theta = \theta_2$,
the allocation $\mathcal{E}(\theta_1)$ gives rise to maximal utility. Therefore, the
pooling equilibrium dominates the separating equilibrium since the line
with slope $-E\tilde\theta$ through $\mathcal{E}(E\tilde\theta)$ is below that through $\mathcal{E}(\theta_1)$ (lower
indifference curves denote higher utility).³

For $\theta_2$ big enough relative to $\theta_1$, a pooling equilibrium may fail to
exist, as we noted in proposition 4. In such a case, the seller's favorite
PBE is separating.

Proposition 7. Suppose that $\lim_{x \to \infty} x u'(x) = \infty$. For $\theta_2$ sufficiently
big relative to $\theta_1$, the BPBE is the separating equilibrium defined by
proposition 2.

Proof. From proposition 5, the BPBE is deterministic, that is, either
completely separating or pooling. From proposition 4, a pooling
equilibrium does not exist. Therefore, the BPBE must be the "best"
separating equilibrium, that is, the one defined by program (9)-(10)
with $w_i = \pi_i$.⁴ Q.E.D.

3 A referee pointed out that, in general, a pooling equilibrium will dominate if $\pi_1$ is
big enough, as fig. 3 makes clear. This observation complements proposition 6.

VI. Private Information about Variability 

So far we have modeled the large trader's private information as
knowledge about the mean return of the risky asset. For this section
only, let us assume that the mean return is $\bar\theta$ and that private
information parameter $\tilde\theta$ instead reflects the variability of the asset.
Specifically, suppose that the distribution when $\tilde\theta = \theta_1$ is a mean-preserving
spread (cf. Rothschild and Stiglitz 1970) of that when $\tilde\theta = \theta_2$. Then if
$\hat\alpha(v, \theta)$ and $\alpha^\circ(v, \theta_1, \theta_2)$ are defined as in Section III, we conclude, by virtue
of the small trader's risk aversion, that $\hat\alpha_2$ and $\alpha^\circ_2$ are positive.

In this setting, the large trader's payoff is

$(\bar\theta - v)(1 - \alpha)$ (20)

independently of the realization of $\tilde\theta$. Recall that in the informal
discussion following proposition 6, there were three effects on the large
trader's payoff from raising $\theta_1$ in a separating equilibrium: (i) the
direct effect, (ii) the effect on $\alpha$, and (iii) the effect on $v$. We noted that
in a pooling equilibrium, the third effect vanishes but i and ii remain.
When $\theta_1$ is replaced by $\bar\theta$ as in (20), effect i vanishes in both the
separating and pooling equilibria. But since we are subtracting the
same effect from both equilibria, the large trader's payoff continues
to increase faster in the separating equilibrium as a function of $\theta_1$.
Thus he continues to favor the pooling equilibrium.

Proposition 8. In the variability model, the BPBE is a pooling
equilibrium if $\theta_1$ and $\theta_2$ are sufficiently close.

Proof. Parallels that of proposition 6.

Just as proposition 6 continues to go through in the variability
setting, so do propositions 2, 3, and 5.

VII. Welfare 

We have concentrated until now on the large trader’s welfare. Let us 
broaden our scope and consider the small traders’ welfare as well. 

It is sometimes asserted that insider trading is undesirable because
insiders profit at the expense of small traders. Often implicit in such
assertions, however, is the assumption not only that small traders are
worse informed but that they are irrational. When traders have
rational expectations, it is not at all clear that they can be "exploited" by
insiders. Indeed, in our simple model, they clearly benefit from the
presence of the large trader. Were he prohibited from trading, they
would be stuck holding all the risky asset themselves.

4 Actually, however, the same solution results regardless of the weights $w_i$, as we
remarked following proposition 2.

Moreover, although, as we have seen, the large trader may find it
desirable to conceal his private information, this concealment need
not work to the disadvantage of the small traders. As we shall show,
the small traders prefer the pooling equilibrium of proposition 3 to
any separating equilibrium provided that the variability of $\tilde\theta$ is not too
great.

Proposition 9. If $\theta_1$ and $\theta_2$ are not too far apart, the small traders'
ex ante utility is higher in the pooling equilibrium defined by
program (11) than in any separating equilibrium.

As was true for the large trader (see the intuitive argument
following proposition 6), the small traders' welfare differs to the first order
between a pooling equilibrium and a separating equilibrium only
because, in the latter, the price $v(\theta_1)$ must be chosen to satisfy the
incentive constraint (13). But this constraint implies that $v(\theta_1)$ must be less
than the "full information" level $v^*(\theta_1)$. Since lower prices hurt small
traders, we conclude that they are worse off in the separating
equilibrium. This argument is formalized in the Appendix.

Remark. Although the traders are unanimous in preferring the
pooling equilibrium of program (11) to any separating equilibrium,
they do not agree on which pooling equilibrium is their favorite. The
large trader, of course, likes the program (11) equilibrium best. The
small traders, however, prefer the equilibrium with the highest
possible price.

It is interesting to note that the effect on welfare of $\tilde\theta$ being private
information is ambiguous, for both large and small traders. Formula
(19) represents the derivative of the large trader's payoff with respect
to $\theta_1$ when the value of $\tilde\theta$ is private information (provided that the
variability of $\tilde\theta$ is not too great). One can readily show that the
corresponding derivative when $\tilde\theta$ is common knowledge is exactly the same.
Thus the large trader's payoff does not depend to the first order on
whether or not $\tilde\theta$ is private information. Moreover, the second-order
comparison depends on the second derivatives of $\alpha^\circ$ and $\hat\alpha$, which are
not signable without additional assumptions on preferences. The
analysis is similar for the small traders.

Thus our model exemplifies the principle that two market
imperfections can be better than one. Introducing an informational
asymmetry in the model may actually make all traders better off.


VIII. Conclusion

In competitive asset markets in which traders have rational
expectations, prices are likely (with some qualifications) to reflect all relevant
private information about the value of the asset. Indeed, if the large
trader in the model of Section II were replaced by a continuum of
price-taking, risk-neutral small traders with private information $\tilde\theta$, the
market price $v(\theta)$ would just be $\theta$, and so would reveal all information.

We have argued, however, that in a model in which private
information is possessed by a trader who is big enough to affect prices, the
information efficiency of prices breaks down. Our model is very
simple, but the intuition for why the large trader gains from concealing
his information seems quite general. We believe, therefore, that the
same sort of reasoning ought to apply to more elaborate market
structures.

Our welfare conclusions are less likely to generalize. We noted in
our model that not only the large trader but the small traders too may
prefer a pooling equilibrium to any separating equilibrium
(proposition 9). This result, however, seems to depend on the small traders
being on the opposite side of the market from the large trader. If
there were also small traders buying the asset, then they would benefit
from the fact that $v(\theta_1)$ must be comparatively low (see the discussion
following proposition 9) in a separating equilibrium and so might
prefer such an equilibrium.


Appendix 

Proof of Proposition 2 

In (10), set $v_2 = v^*(\theta_2)$. Then $\hat\alpha(v_2, \theta_2) = \alpha^*(\theta_2)$. Clearly, $\hat\alpha(v, \theta_1) = 1$ for $v$ low
enough (possibly $-\infty$). Hence, there is a choice of $v_1$ for which (10) holds.
From the lemma, $\hat\alpha(v^*(\theta_2), \theta_1) < \hat\alpha(v^*(\theta_2), \theta_2) = \alpha^*(\theta_2)$. Hence (10) is violated
when $v_1 = v^*(\theta_2)$. By continuity, there exists $v_1^\circ < v^*(\theta_2)$ such that (10) holds
with equality when $v_1 = v_1^\circ$. Thus from (10), $\alpha^*(\theta_2) < \hat\alpha(v_1^\circ, \theta_1)$, and so

$[\theta_1 - v^*(\theta_2)][1 - \alpha^*(\theta_2)] < (\theta_1 - v_1^\circ)[1 - \hat\alpha(v_1^\circ, \theta_1)]$. (A1)

Consider the program

$\max_{v_1} \; (\theta_1 - v_1)[1 - \hat\alpha(v_1, \theta_1)]$ (A2)

subject to

$[\theta_2 - v^*(\theta_2)][1 - \alpha^*(\theta_2)] \ge (\theta_2 - v_1)[1 - \hat\alpha(v_1, \theta_1)]$. (A3)

From the argument above, constraint (A3) can be satisfied, and so a solution
$v_1 = v(\theta_1)$ exists. Define $v(\theta_2) = v^*(\theta_2)$ and $\alpha(v_i) = \hat\alpha(v(\theta_i), \theta_i)$, $i = 1, 2$.
Because $(v(\theta_1), \alpha(v(\theta_1)))$ satisfies (A3), (3b) is satisfied, and in view of (A1) and
the fact that $v(\theta_1)$ solves (A2)-(A3), (3a) is satisfied. Because $v(\theta_2) = v^*(\theta_2)$
and $v(\theta_1)$ solves (A2)-(A3), $v(\cdot)$ solves the program (9)-(10) for any positive
weights $w_1$ and $w_2$. For rational expectations, we must take $g(\theta_i | v(\theta_i)) = 1$ for
$i = 1, 2$. For $v \ne v(\theta_i)$, let $g(\theta_2|v) = 1$, and so $\alpha(v) = \hat\alpha(v, \theta_2)$. It remains to show
that it does not pay the large trader to set $v \notin \{v(\theta_1), v(\theta_2)\}$. By definition of
$v(\theta_2)$,

$[\theta_2 - v(\theta_2)][1 - \alpha(v(\theta_2))] \ge (\theta_2 - v)[1 - \hat\alpha(v, \theta_2)]$ for all $v$. (A4)

Hence, we need only check that the large trader will not take $v \notin \{v(\theta_1), v(\theta_2)\}$
when $\tilde\theta = \theta_1$.

If $v < v(\theta_1)$, then, by definition of $v(\theta_1)$,

$[\theta_1 - v(\theta_1)][1 - \alpha(v(\theta_1))] \ge (\theta_1 - v)[1 - \hat\alpha(v, \theta_1)]$,

which, because $\hat\alpha_2 > 0$, implies that

$[\theta_1 - v(\theta_1)][1 - \alpha(v(\theta_1))] \ge (\theta_1 - v)[1 - \hat\alpha(v, \theta_2)]$, (A5)

as desired. If $v > v(\theta_1)$, then, because (A3) holds with equality,

$[\theta_2 - v(\theta_1)][1 - \alpha(v(\theta_1))] \ge (\theta_2 - v)[1 - \hat\alpha(v, \theta_2)]$. (A6)

If $1 - \alpha(v(\theta_1)) \le 1 - \hat\alpha(v, \theta_2)$, then (A6) and the fact that $\theta_1 < \theta_2$ imply that
(A5) holds. If $1 - \alpha(v(\theta_1)) > 1 - \hat\alpha(v, \theta_2)$, then the fact that
$\theta_2 - v(\theta_1) > \theta_2 - v$ implies that (A5) holds again. Q.E.D.


Proof of Proposition 3 

For convenience, we shall assume that the solution to (11) is unique, so that
$v^\circ(\theta_1, \theta_2)$ is well defined. To construct the pooling equilibrium, take
$v(\theta_1) = v(\theta_2) = v^\circ(\theta_1, \theta_2)$, $\alpha(v^\circ(\theta_1, \theta_2)) = \alpha^\circ(v^\circ(\theta_1, \theta_2), \theta_1, \theta_2)$, and
$g(\theta_i | v^\circ(\theta_1, \theta_2)) = \pi_i$, $i = 1, 2$. It remains to construct the equilibrium for
out-of-equilibrium prices. For any price $v$ ($\ne v^\circ(\theta_1, \theta_2)$), let $g(\theta_2|v) = 1$. We must
show that, given these beliefs, the large trader will never set such a price if
$\theta_1$ and $\theta_2$ are close enough. Suppose first that $\tilde\theta = \theta_2$. If $\theta_1 = \theta_2$, then the
large trader will not gain by setting $v \ne v^\circ(\theta_1, \theta_2)$ since, in this case,
$v^\circ(\theta_1, \theta_2) = v^*(\theta_2)$. Thus it suffices to show that, at $\theta_1 = \theta_2$, the large trader's
payoff $[\theta_2 - v^\circ(\theta_1, \theta_2)][1 - \alpha^\circ(v^\circ(\theta_1, \theta_2), \theta_1, \theta_2)]$ is decreasing in $\theta_1$, because then

$[\theta_2 - v^\circ(\theta_1, \theta_2)][1 - \alpha^\circ(v^\circ(\theta_1, \theta_2), \theta_1, \theta_2)] > [\theta_2 - v^*(\theta_2)][1 - \alpha^*(\theta_2)] \ge (\theta_2 - v)[1 - \hat\alpha(v, \theta_2)]$ for all $v$

when $\theta_1$ is near $\theta_2$. Now

$\dfrac{d}{d\theta_1}\{[\theta_2 - v^\circ(\theta_1, \theta_2)][1 - \alpha^\circ(v^\circ(\theta_1, \theta_2), \theta_1, \theta_2)]\} = -(\theta_2 - v^\circ)(\alpha^\circ_1 v^\circ_1 + \alpha^\circ_2) - (1 - \alpha^\circ)v^\circ_1$. (A7)

But at $\theta_1 = \theta_2$, the first-order condition for the maximization in (11) is

$-(\theta_2 - v^\circ)\alpha^\circ_1 - (1 - \alpha^\circ) = 0$,

and so the right-hand side of (A7) reduces to

$-(\theta_2 - v^\circ)\alpha^\circ_2$. (A8)

Since $\alpha^\circ_2 > 0$, (A8) is negative at $\theta_1 = \theta_2$, as required.

Assume next that $\tilde\theta = \theta_1$. In this case, if the large trader failed to set
$v = v^\circ(\theta_1, \theta_2)$, his alternative is to choose $v = v^{\circ\circ}(\theta_1, \theta_2)$ that solves the
following program:

$\max_{v} \; (\theta_1 - v)[1 - \hat\alpha(v, \theta_2)]$

since $g(\theta_2 | v) = 1$ for all $v \ne v^\circ(\theta_1, \theta_2)$. Thus it suffices to show that his gain
from choosing $v^\circ$ rather than $v^{\circ\circ}$ is decreasing in $\theta_1$ at $\theta_1 = \theta_2$, that is,

$\dfrac{d}{d\theta_1}\{[\theta_1 - v^\circ(\theta_1, \theta_2)][1 - \alpha^\circ(v^\circ(\theta_1, \theta_2), \theta_1, \theta_2)] - [\theta_1 - v^{\circ\circ}(\theta_1, \theta_2)][1 - \hat\alpha(v^{\circ\circ}(\theta_1, \theta_2), \theta_2)]\}\Big|_{\theta_1 = \theta_2} < 0$. (A9)

But from the same application of the envelope theorem that we used in the
previous paragraph, we can rewrite the left-hand side of (A9) as

$-(\theta_1 - v^\circ)\alpha^\circ_2 + 1 - \alpha^\circ - (1 - \hat\alpha)$.

Now $\alpha^\circ = \hat\alpha$ at $\theta_1 = \theta_2$, and $\alpha^\circ_2 > 0$. Hence, (A9) holds, as required. Q.E.D.


Proof of Proposition 4 

Suppose that there exists a pooling equilibrium in which $v(\theta_i) = v$ and $\alpha(v(\theta_i)) = \alpha$,
$i = 1, 2$. If $\alpha < 1$, then

$\pi_1 E(\theta_1 + \epsilon - v)u'(w_0 + v + \alpha(\theta_1 + \epsilon - v)) + \pi_2 E(\theta_2 + \epsilon - v)u'(w_0 + v + \alpha(\theta_2 + \epsilon - v)) \le 0$. (A10)

Clearly, $0 \le v \le \theta_1$. Therefore, the left-hand side of (A10) is no less than

$\pi_1 \int_{\epsilon < 0} \epsilon\,u'(w_0 + \epsilon)\,dF(\epsilon) + \pi_1 \int_{\epsilon \ge 0} \epsilon\,u'(w_0 + \theta_1 + \epsilon)\,dF(\epsilon) + \pi_2 E(\theta_2 + \epsilon - \theta_1)u'(w_0 + (1 - \alpha)\theta_1 + \alpha(\theta_2 + \epsilon))$. (A11)

The first two terms of (A11) are independent of $\theta_2$. Denote them by $K$. As $\theta_2$
tends to infinity, so does the third term, thanks to our hypothesis about $u$.
Thus eventually the third term exceeds $|K|$, a contradiction of (A10). Hence,
$\alpha = 1$, and so the large trader's payoff is zero when $\tilde\theta = \theta_2$. But since $\alpha^*(\theta_2) < 1$
by hypothesis, he can obtain a positive payoff by setting $v = v^*(\theta_2)$,
a contradiction. Q.E.D.


Proof of Proposition 5 

Consider a BPBE $(v(\cdot), \alpha(\cdot), g(\cdot|\cdot))$. From proposition 1,

$\max\{v \mid v \in \text{support of } v(\theta_1)\} \le \min\{v \mid v \in \text{support of } v(\theta_2)\}$. (A12)

Suppose first that (A12) is strict. Then the small traders are perfectly
informed about $\tilde\theta$ by price. For $i = 1, 2$, choose $v_i \in$ support of $v(\theta_i)$ such that
$U(\theta_i, \alpha(v_i), v_i) \ge U(\theta_i, \alpha(v_i'), v_i')$ for all $v_i' \in$ support of $v(\theta_i)$. But then we can
define a deterministic PBE $(\tilde v(\cdot), \tilde\alpha(\cdot), \tilde g(\cdot|\cdot))$ where, for $i = 1, 2$, $\tilde v(\theta_i) = v_i$ (and
$\tilde\alpha(\cdot) = \alpha(\cdot)$ and $\tilde g(\cdot|\cdot) = g(\cdot|\cdot)$) in which the large trader is as well off as before
since he is indifferent among all prices in the support of $v(\theta_i)$.

Suppose henceforth that (A12) holds with equality and that both sides
equal $\bar v$. If $\bar v$ is not a mass point of $v(\theta_1)$ and $v(\theta_2)$, the argument of the
preceding paragraph applies. Assume, therefore, that $\Pr\{v(\theta_i) = \bar v\} > 0$ for
$i = 1, 2$.

Case 1: Only $v(\theta_1)$ Is Random

In this case, $v(\theta_2) = \bar v$. Define $\tilde v(\cdot)$ so that $\tilde v(\theta_i) = \bar v$ for $i = 1, 2$. Define $\tilde g(\cdot|\cdot)$ so
that $\tilde g(\theta_i|\bar v) = \pi_i$, $i = 1, 2$, and $\tilde g(\cdot|v) = g(\cdot|v)$ for $v \ne \bar v$. Finally, let
$\tilde\alpha(\bar v) = \alpha^\circ(\bar v, \theta_1, \theta_2)$ and $\tilde\alpha(v) = \alpha(v)$ for $v \ne \bar v$. Because $\tilde g(\theta_2|\bar v) < g(\theta_2|\bar v)$, $\tilde\alpha(\bar v) < \alpha(\bar v)$
from the lemma. Thus the large trader has no incentive to deviate from $\bar v$,
that is, $(\tilde v(\cdot), \tilde\alpha(\cdot), \tilde g(\cdot|\cdot))$ is an equilibrium. Moreover, the large trader does
better in this equilibrium than in the original equilibrium, regardless of the
value of $\tilde\theta$.

Case 2: $v(\theta_2)$ Is Random

Because $g(\theta_1|\bar v) < 1$, the lemma implies that $\alpha(\bar v) > \hat\alpha(\bar v, \theta_1)$. Hence, there
exists $v_1 < \bar v$ such that $[1 - \hat\alpha(v_1, \theta_1)](\theta_1 - v_1)$ is the payoff that the large
trader gets in equilibrium if $\tilde\theta = \theta_1$. Thus $\hat\alpha(v_1, \theta_1) > \alpha(\bar v)$, and so

$(\theta_2 - \bar v)[1 - \alpha(\bar v)] > (\theta_2 - v_1)[1 - \hat\alpha(v_1, \theta_1)]$. (A13)

Choose $v_2 \in$ support of $v(\theta_2)$ with $v_2 > \bar v$. Define $(\tilde v(\cdot), \tilde\alpha(\cdot), \tilde g(\cdot|\cdot))$ so that

$(\tilde v(\theta_1), \tilde\alpha(v_1), \tilde g(\theta_1|v_1)) = (v_1, \hat\alpha(v_1, \theta_1), 1)$,
$(\tilde v(\theta_2), \tilde\alpha(v_2), \tilde g(\theta_2|v_2)) = (v_2, \alpha(v_2), 1)$,
$\tilde\alpha(v) = \hat\alpha(v, \theta_2)$ for $v \notin \{v_1, v_2\}$,
$\tilde g(\theta_2|v) = 1$ for $v \notin \{v_1, v_2\}$.

This construction satisfies the large trader's incentive constraints (condition
iii of the definition of PBE) when $\tilde\theta = \theta_1$ since his payoff is the same as in
the original PBE and, thanks to the lemma and to the definition of $\tilde g(\cdot|\cdot)$,
$\tilde\alpha(v) \ge \alpha(v)$ for all $v \ne \tilde v(\theta_1)$. Moreover, it also satisfies his incentive constraints
when $\tilde\theta = \theta_2$ because

$(\theta_2 - v_2)[1 - \tilde\alpha(v_2)] = (\theta_2 - \bar v)[1 - \alpha(\bar v)] > (\theta_2 - v_1)[1 - \hat\alpha(v_1, \theta_1)] = (\theta_2 - v_1)[1 - \tilde\alpha(v_1)]$,

where the two equations follow from construction and the inequality follows
from (A13). Hence, $(\tilde v(\cdot), \tilde\alpha(\cdot), \tilde g(\cdot|\cdot))$ is an equilibrium and, hence, a
deterministic BPBE. Q.E.D.


Proof of Proposition 9 

We first observe that, in any separating equilibrium (v(-), a(-), g(-|-», u(0 2 ) = 
i>*(0 2 ). This follows since v = v*(0 2 ) maximizes (0 2 - i/)[l - &(v, 0 2 )] and, 
from the large trader’s standpoint, the most disadvantageous beliefs that the 
small traders could have are g($ 2 1 v) = 1 . 

From the proof of proposition 6, the separating equilibrium (v*(-), «*(•)> 
g*(-|-)) that solves program (12)—(13) satisfies (13) with equality when 0j is 
near enough 0 2 . Moreover, as the proof shows, (13) is violated for (iq, v 2 ) such 
that v 2 = v*(0 2 ) = v*(0 2 ) and v*(0,) <«,< r*(0 2 ) = v*(0 2 ). Hence, for 0, 
near 0 2 , any separating equilibrium (v(-), a(-), g(-| •)) must satisfy u 2 (0 2 ) = 
w *(® 2 ) snd Vi(0i) s 5*(0i). But for a small trader, a higher price is unambigu¬ 
ously better than a lower price. Thus (t>*( ), <**(•), g*( -|-)) is the small traders’ 
favorite separating equilibrium, and it suffices to show that they consider this 
equilibrium inferior to the pooling equilibrium defined by (11). 

When 0] = 0 2 , the small traders are, of course, indifferent between the two 
equilibria. Thus we need demonstrate only that the derivative of the small 
traders' payoff with respect to 0j at 0] = 0 2 is greater for the separating than 
for the pooling equilibrium. For the separating equilibrium (v*( ), «*(■), 
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«•*(•! )). we have 

y tt,Eu(wo + i;*(fl.) + re, + € - t»*(9,)]«i(i/*{e 1 ), e,)) 

«“l t 


= 'ir 1 £u 


'[{1 - & + te, 


+ * - ^*<e 1 )]& 1 } 


dv*(B x ) 

de x 


(A14) 


+ («,+€- J /*(0 1 )]a 2 ( V *(6 1 ),6 1 ) + d(t;*(0,),0,)l. 
Because (13) holds with equality, 


and so 


^[0 2 - v*(0,)][i - &(r*(e.), 0,)] = o, 


{1 - a + [0 2 - v^OOJd,}^ + [0 2 - u*(0,)]d 2 = 0. (A 15) 


Now x/*(0 2 ) maximizes (0 2 — i/)[l - a(v, 0 2 )]. Thus 

[0 2 - u*(0 2 )]di(v*(0 2 ), 0 2 ) + 1 - d(u*(0 2 ). 02) = 0. (A 16) 

But at 0! = 0 2 , v*(0|) = i/*(0 2 ), and so from (A16), the coefficient of dv*/dQ x 
in (A15) is zero. Thus from (A15), 


rfi/*(0i) 

<* 0 , 

From (A 15), the right-hand side of (A14) simplifies at 0, = 0 2 to 

dv* 


tt\Eu' 


€d, + €d 2 + d 

do i 


(A 17) 


(A 18) 


From (A 16), di < 0. Hence, in view of (A 17) and because feu’ < 0, (A 18) is 
and so the derivative of the small traders’ payoff is infinite. 

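By contrast, the derivative of the small traders' payoff at $\theta_1 = \theta_2$ in the
pooling equilibrium is

$$
\frac{d}{d\theta_1} \sum_i \pi_i E u\bigl(w_0 + v^\circ(\theta_1, \theta_2) + [\theta_i + \epsilon - v^\circ(\theta_1, \theta_2)]\alpha^\circ(v^\circ(\theta_1, \theta_2), \theta_1, \theta_2)\bigr)
= \sum_i \pi_i E u' \Bigl[ \{1 - \alpha^\circ + (\theta_i + \epsilon - v^\circ)\alpha^\circ_1\} \frac{\partial v^\circ}{\partial \theta_1} + (\theta_i + \epsilon - v^\circ)\alpha^\circ_2 \Bigr] + \pi_1 E u' \alpha^\circ, \tag{A19}
$$

where $v^\circ(\theta_1, \theta_2)$ maximizes $\sum_i \pi_i (\theta_i - v)[1 - \alpha^\circ(v, \theta_1, \theta_2)]$. The first-order
condition for this maximization is

$$
\sum_i \pi_i [(\theta_i - v^\circ)\alpha^\circ_1 + 1 - \alpha^\circ] = 0. \tag{A20}
$$

Implicitly differentiating (A20) with respect to $\theta_1$, we obtain

$$
\sum_i \pi_i \Bigl\{ [(\theta_i - v^\circ)\alpha^\circ_{11} - 2\alpha^\circ_1] \frac{\partial v^\circ}{\partial \theta_1} + (\theta_i - v^\circ)\alpha^\circ_{12} - \alpha^\circ_2 \Bigr\} + \pi_1 \alpha^\circ_1 = 0,
$$

and hence

$$
\frac{\partial v^\circ}{\partial \theta_1} = \frac{-\pi_1 \alpha^\circ_1 - \sum_i \pi_i [(\theta_i - v^\circ)\alpha^\circ_{12} - \alpha^\circ_2]}{\sum_i \pi_i [(\theta_i - v^\circ)\alpha^\circ_{11} - 2\alpha^\circ_1]}. \tag{A21}
$$

But (A21) is clearly finite, and thus so is (A19). Hence the derivative for the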
separating equilibrium is bigger, as required. Q.E.D.

In 1961, Vickrey showed that, in an independent private-values context
with symmetric risk-neutral bidders, sealed second-price auctions
have dominant truth-revealing equilibrium strategies, that they
are perfectly efficient economically, and that they produce the same
expected revenue for bid takers as equilibrium strategies in oral
progressive auctions, Dutch auctions, or standard, first-price sealed
bidding. Yet sealed second-price auctions seldom occur. We argue
that fear of cheating and especially disincentives for bidders to follow
truth-revealing strategies are important explanations. We model
auctions in which third parties capture a fraction of the economic
rent revealed by the second-price procedure.


Introduction 

Over a quarter of a century ago, Columbia University economics professor
William Vickrey (1961) analyzed and compared four kinds of
auctions. Three of them, oral progressive auctions, standard first-price
sealed bidding, and Dutch auctions, were in common use. The
fourth was a sealed second-price procedure (now sometimes called
“Vickrey auctions”) that he devised in order to have a sealed-bid
procedure that, in some ways, was “logically isomorphic” to oral auctions.¹
In this procedure, the best bid would win, but the payment to
or by its maker would be the amount of the best losing bid. His
analysis showed that in an independent private-values model with
risk-neutral bidders, such sealed second-price procedures have several
desirable properties.

First of all, in this context the equilibrium strategies are truth- 
revealing. That is, the equilibrium strategy is that the bidder bids his 
or her true cost or value. In addition, these truth-revealing strategies 
are not only equilibrium strategies, they are dominant strategies. That 
is, it is optimal for a bidder to follow the truth-revealing strategy even 
if he or she assigns a positive probability to the possibility that his or 
her competitors will deviate from their equilibrium strategies. Truth- 
revealing strategies simplify bid preparation. Because they are domi¬ 
nant strategies, they do not require the gathering or analysis of any 
information about the situation or intentions of competitors. 
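
The dominance property can be checked by brute force on a small example. The following is only a sketch, written for a high-bid-wins auction with a single best rival bid; the function names, the bid grid, and the rule that ties count as losses are illustrative assumptions and do not come from Vickrey's paper.

```python
def second_price_payoff(bid, value, rival_bid):
    """Payoff in a high-bid-wins sealed second-price auction.

    The bidder wins when its bid exceeds the best rival bid and then pays
    that rival bid. Ties are treated as losses (an assumption made here
    only to keep the sketch short).
    """
    if bid > rival_bid:
        return value - rival_bid
    return 0.0


def truthful_is_weakly_dominant(value, grid):
    """Check on a finite grid that no bid ever beats bidding `value`."""
    for rival_bid in grid:
        truthful = second_price_payoff(value, value, rival_bid)
        for bid in grid:
            if second_price_payoff(bid, value, rival_bid) > truthful + 1e-12:
                return False
    return True


# Hypothetical grid of candidate bids and values: 0.00, 0.05, ..., 1.00.
grid = [k / 20 for k in range(21)]
print(all(truthful_is_weakly_dominant(v, grid) for v in grid))
```

The check compares the truthful bid with every alternative bid against every rival bid, so it makes no assumption about how the rival chooses its bid. That is the sense in which the strategy is dominant rather than merely an equilibrium strategy.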

Second, at equilibrium the auction always leads to complete eco¬ 
nomic efficiency. The bidder with the highest value or the lowest cost 
always wins. There is no chance that a bidder with a higher value will 
misestimate the level of competition and lose the auction to a bidder 
with a lower value. 

Third, Vickrey showed, in a result that has since been generalized 
by Myerson (1981), that if the bidders are symmetric, that is, draw 
their independent private values from the same distribution, the ex¬ 
pected revenue to the bid taker in all four kinds of auctions is the 
same with equilibrium behavior. 
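
A small simulation can illustrate the revenue equivalence claim for one special case. The sketch below assumes high-bid-wins auctions with values drawn independently from the uniform distribution on [0, 1], and it uses the standard symmetric first-price equilibrium bid for that case, (n - 1)/n times the bidder's value. The function name, the seed, and the sample size are illustrative choices.

```python
import random


def simulate_revenues(n_bidders, n_auctions, seed=0):
    """Average seller revenue in first- and second-price sealed auctions.

    Values are i.i.d. uniform on [0, 1]. Second-price bidders bid their
    values; first-price bidders use the symmetric equilibrium bid
    (n - 1) / n * value that holds for this uniform case.
    """
    rng = random.Random(seed)
    first_total = 0.0
    second_total = 0.0
    for _ in range(n_auctions):
        values = sorted(rng.random() for _ in range(n_bidders))
        first_total += (n_bidders - 1) / n_bidders * values[-1]  # winner pays own bid
        second_total += values[-2]                               # winner pays best losing bid
    return first_total / n_auctions, second_total / n_auctions


first, second = simulate_revenues(n_bidders=3, n_auctions=100_000)
# Both averages should be close to (n - 1) / (n + 1), which is 0.5 for n = 3.
print(round(first, 3), round(second, 3))
```

The two averages agree only in expectation. In any single auction the two payments generally differ, which is why the result is stated in terms of expected revenue.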

Vickrey’s paper has led to a great deal of research by theoretical 
economists. McAfee and McMillan (1987) have recently written an 
able and extensive summary of the results of this research in the 
Journal of Economic Literature. In it, they write that “William Vickrey’s 
remarkable 1961 paper, two decades ahead of its time, is still worth 
reading as an introduction to the analysis of auctions” (p. 701, n. 2). 
Yet in spite of the attention this paper has received since 1961 and in 
spite of the advantages of sealed second-price procedures that Vick¬ 
rey found, such procedures remain extremely rare, a fact discussed in 
more detail in the next section. 


¹ Given the results discussed here, this “isomorphism” was less complete than Vickrey
had hoped.

This paper inquires into the reasons for the infrequent use of Vickrey's
proposed sealed second-price auctions. This concern is not just
the result of intellectual curiosity, but arose in the course of attempting
to aid government agencies to design an effective auction mechanism
for the purchase by electric utilities of power from cogenerators
and small renewable power sources (Rothkopf et al. 1987).

In this paper, we consider seven different possible reasons why 
Vickrey auctions might be rare. First, we argue that five of them are 
unpersuasive. One concern, considered by Vickrey himself, arises 
when multiple objects are to be sold in a single auction (as, e.g., with 
Treasury bills) and a single bidder wishes to bid on more than one 
item. Another concern, also considered by Vickrey, is the breakdown 
of the revenue equivalence result (but not the dominance of truth- 
revealing strategies or of the economic efficiency result) in the face of 
asymmetry among bidders. Two other concerns, the effect of bidder 
risk aversion and the presence of nonindependent information, have 
been discussed extensively in the literature that followed Vickrey’s 
paper. We also consider the possible role of inertia. 

Next, we consider two concerns that we do believe to be important 
explanations of the scarcity of Vickrey auctions. One of these is con¬ 
cern about cheating, a factor discussed but not regarded as critical by 
Vickrey. The other, which we believe to be new to the literature, is 
bidder reluctance, for both behavioral reasons and economic reasons 
related to subsequent transactions, to use truth-revealing strategies. 

In order to study the economic disincentives for truth-revealing
strategies, we embed a simple auction model in a context in which a
fraction of the revealed economic rent of a second-price auction is
captured by third parties. In this model, the loss of
rent to third parties results in an adjustment to the equilibrium bidding
strategies. The effect of this revision, on the average, is to pass
on to the bid taker all the loss of rent to third parties.

This result in our simple model is not driven by its simplicity. We
point out that the logic of Myerson’s (1981) powerful revenue equivalence
theorem, properly applied in the context of partial or total third-party
capture of revealed economic rent, requires that in any independent
private-values model the average cost of that capture always be
passed on to the bid taker at equilibrium. The revenue that is “equivalent”
is the combined revenue of the bid taker and the third parties.

Before concluding this paper, we discuss the implications of our 
work for the kind of research needed to support the application of 
economic theory to the design of market mechanisms. We argue that 
the low theory task of including important factors in auction design 
models merits attention.


Reports of the Use of Vickrey Auctions

In addition to the casual observations of people interested in auction 
theory, there are other reasons for believing that sealed second-price 
bidding is rare. In the introduction to his 1961 paper, Vickrey refers 
to “modification of current practices” in single-item sealed-bid auc¬ 
tions and “departures from currently prevalent practices” in sealed- 
bid sales of multiple identical items. Nowhere in the paper does he 
indicate an awareness of any sealed-bid sales in which the award is at 
the price of the best losing bid. In section 5 of his paper, in which he 
discusses the sale of multiple identical items, he does refer to and 
criticize an “alternative method” (to the usual first-price method) in 
which the price for all successful bidders is set at the level of the worst 
bid accepted. (Where there are many finely graduated bids, this pro¬ 
cedure may in fact approximate closely Vickrey’s preferred best- 
rejected-bid nondiscriminatory procedure.) 
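
The three pricing rules mentioned in this paragraph can be compared on a small example. The sketch below assumes a high-bid-wins sale in which each bidder wants exactly one unit and at least one item is sold; the function name and the sample bids are hypothetical.

```python
def multi_unit_payments(bids, n_items):
    """Per-winner payments under three sealed-bid pricing rules.

    `bids` holds one bid per bidder, each for a single unit, and the
    `n_items` highest bids win (n_items >= 1 is assumed). The rules are:
    discriminatory first-price, a uniform price at the worst accepted
    bid, and a uniform price at the best rejected bid.
    """
    ranked = sorted(bids, reverse=True)
    winners, losers = ranked[:n_items], ranked[n_items:]
    best_rejected = losers[0] if losers else 0.0
    return {
        "discriminatory": list(winners),
        "worst_accepted": [winners[-1]] * len(winners),
        "best_rejected": [best_rejected] * len(winners),
    }


# Hypothetical finely graduated bids for three identical items.
print(multi_unit_payments([10.0, 9.5, 9.4, 9.3, 7.0], n_items=3))
```

With bids this close together, the worst accepted bid (9.4) and the best rejected bid (9.3) nearly coincide, which is the approximation noted in the parenthetical remark above.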

While Cassady’s (1967) extensive survey of the usage of various 
kinds of auctions concentrates on oral procedures, it does discuss 
sealed bidding but does not mention any sealed second-price proce¬ 
dures. Finally, many current sealed-bid auctions of economic impor¬ 
tance are first-price auctions. This includes federal oil lease and coal 
lease auctions, federal timber sales when done by sealed bid, sales of 
federal debt including Treasury bills, and a wide variety of transac¬ 
tions by the California State Lands Commission and the federal Gen¬ 
eral Services Administration. 

There are, however, some auctions that are essentially sealed 
second-price auctions. In some auctions of collectible items such as 
stamps and autographs, the auction involves both mailed-in sealed 
bids based on a catalog listing as well as oral bids. In at least some of 
these, the mailed-in sealed bids are explicitly upper limits to which the 
auctioneer, acting as the mail bidder’s agent, may advance the bid, 
rather than a “standard” first-price bid. (For example, in a catalog for 
an auction held on October 10, 1987, Stampazine states the following 
terms of sale: “Each bid is executed at the indicated advance over the 
next lowest bid or the starting bid.”) In addition, at least one seller of 
collectible items, Americana Arts Auction, of Denmark, Maine, con¬ 
ducts sales by catalog with second-price sealed bids, and the only oral 
bids are by telephone. 

In 1973 and 1974, under Secretary George Shultz (who has an 
economics degree), the U.S. Treasury experimented with nondis¬ 
criminatory sealed bidding in seven sales of Treasury bonds. An im¬ 
portant motive for using nondiscriminatory auctions was to attract 
into the competition relatively small buyers who normally purchased
Treasury bonds at a small markup from auction winners. Apparently,
this did not occur. The use of nondiscriminatory auctions was aban¬ 
doned after Shultz was replaced by William Simon and has not been 
resumed since. However, bidders interested in small quantities of 
Treasury bonds may now submit “noncompetitive bids.” These bids 
are filled at the average price obtained for the bonds sold competi¬ 
tively. Most bonds sold in each sale are sold competitively. 

There have also been some experiments with other financial instru¬ 
ments and stock repurchasing. Recently, some companies have of¬ 
fered to repurchase their stock using a second-price procedure 
(Jacobs [1988]; also Merrill Lynch in a report on October 3, 1988, to 
clients for whom it holds stock in Schlumberger Ltd.). At least one 
preferred stock is subject to periodic two-sided, second-price auctions 
in which the rate of the next dividend is to be set at a market-clearing 
value (Tucson Electric Power Co. on October 8, 1986, in a certificate 
required by Arizona law setting forth the terms). 

Recently, the California Public Utilities Commission considered the 
form of auctions to be used in the future for the purchase of electric 
power under long-term contracts by California utilities from cogener¬ 
ators and small power producers qualifying under the federal Public 
Utilities Regulatory Policy Act of 1978. They selected the sealed sec¬ 
ond-price auction format (decision no. 86-07-004, July 2, 1986) after 
hearing arguments based on those of Vickrey by John Jurewitz and G. 
Vail. However, since California utilities became committed to a large 
quantity of cogeneration power under a previous posted price proce¬ 
dure, no auctions have yet been held in California. In addition, auc¬ 
tions for similar purposes held in Maine and in Massachusetts have 
been first-price auctions. 

Aside from the few examples mentioned here, we are not aware of 
any use of sealed second-price auctions. 


Five Nonreasons for the Rarity of 
Vickrey Auctions 

We have identified seven different potential reasons to account for 
the fact that Vickrey auctions are unusual. In this section, we discuss 
and reject five of them that we find unconvincing. 

Multiple Objects for Sale 

One potential objection to the use of sealed second-price auctions is
that two of their desirable properties (they have dominant truth-revealing
strategies and they are economically efficient) both break
down if there are multiple items involved in the auction and if any
bidder wishes to bid for more than one item. This concern was discussed
by Vickrey himself as well as by Dubey and Shubik (1980).
Dubey and Shubik have specified a modification of the sealed second-price
procedure that restores the truth-revealing nature of optimum
strategies. However, the modification amounts to little more than an
explicit recognition of the market power of bidders interested in
more than one item, and, to our knowledge, it has not been tried. In
particular, it has not been implemented in the trial Treasury bill
auctions and in the proposed California cogeneration auctions in
spite of the interest of bidders in making multiple bids.

More fundamentally, we are convinced that it is not the key reason 
for the scarcity of Vickrey auctions because, if it were, we would 
expect to see many Vickrey auctions in which only single items are 
sold and, hence, in which it can be of no force. 


Bidder Risk Aversion 

Vickrey’s results depend on his assumption that bidders are expected 
profit maximizers. It is now well established that, in an independent 
private-values model with risk-averse bidders, the bid taker can ex¬ 
pect more revenue with a first-price auction than with a second-price 
auction (Holt 1980; Harris and Raviv 1981; Riley and Samuelson 
1981; Maskin and Riley 1984; McAfee and McMillan 1987). Could 
this account for the rarity of sealed second-price auctions? 

We think not. We do not doubt that bidders are often risk averse 
and that many bid takers would prefer more expected revenue to less. 
However, the interpretation of von Neumann-Morgenstern risk
aversion in this context is perverse. In particular, risk averse does not 
mean “cautious.” Because of the independent private-values context, 
the “risk” to which risk-averse bidders are averse is the risk of not 
winning the auction. That outcome with its zero profit is assumed to 
be the worst possible event. There is no allowance, except, perhaps, in 
the private value, for any chance that the subject of the auction will be 
worth less to the bidder than he anticipated. In the context of sealed 
bidding, the cautious bidder may feel comforted by the safety margin 
built into his optimal bidding strategy in a first-price auction, but 
panicky about the chance that he will actually have to pay the true 
value his optimal second-price strategy calls on him to make. If he is 
more concerned about losing after winning the auction than about 
merely losing the auction, the standard theory for risk-averse bidders
with independent private values may not describe his behavior. Thus 
it is not at all clear that bid takers would actually prefer sealed second- 
price procedures. 

However, not only is it unclear that bidder risk aversion would lead
bid takers to prefer sealed second-price auctions; it is also unclear that
bid takers get to choose the auction form without any other considera¬ 
tions. Bidders may be able to choose whether to participate, and that 
choice may be influenced by auction form. Most of the high theory 
on auction form summarized so well by McAfee and McMillan (1987) 
assumes that bidder participation is unaffected by the choice of form. 
Engelbrecht-Wiggans (1987) has recently argued that there are rea¬ 
sons for questioning that assumption. He shows the potential impact 
of a shift in a bidder’s auction participation decision on the bid tak¬ 
er’s decision on choice of reservation price. Furthermore, Rothkopf 
(1986) unearthed some old empirical evidence (Albion 1961) that bid 
takers have indeed profited from increased auction participation be¬ 
cause of their choosing an auction form that appealed to bidders. 
Hence, we are not persuaded that bidder risk aversion is an important 
reason for the scarcity of Vickrey auctions. 

Finally, we see little use of Vickrey auctions in situations in which 
bidders are unlikely to be significantly risk averse. 


Bidder Asymmetry 

Vickrey’s revenue equivalence results do not hold if bidders are asym¬ 
metric in the sense that, a priori, one can make statements distin¬ 
guishing bidders’ relative value for the subject of the auction or their 
relative information situation. Vickrey analyzed and discussed this. 
He argued that since auction forms were not usually varied from 
auction to auction, there was unlikely to be a long-term allocative 
difference. However, the Pareto optimality of the second-price auc¬ 
tion becomes more important when the situation is asymmetric. 

We see no flaw in Vickrey’s discussion. We do not believe that bid¬ 
der asymmetry is likely to be a significant reason for bid takers to 
prefer first-price auctions. Furthermore, as we argued above in the
discussion of risk aversion, we do not believe that bid taker preference 
in the context of models with a fixed number of bidders is necessarily 
controlling about auction form. 


Nonindependent Values 

When Vickrey wrote his paper, the only academic discussions of auc¬ 
tions that considered the issue assumed independent private values. 
Later, other work appeared that developed models based on an as¬ 
sumption that bidders had a common (but unknown) value (Rothkopf 
1969; Capen, Clapp, and Campbell 1971; Oren and Williams 1975). 
Owing particularly to the persuasiveness of Capen et al., this became 
the preferred form for modeling many auctions, especially offshore
oil lease sales. In 1982, Milgrom and Weber generalized the range of
assumptions considered by defining and obtaining results for a class
of “affiliated” values. Roughly, if values are affiliated, then a high 
value estimate by one bidder is evidence for a higher value for all 
bidders. Common-value models are a special case of affiliated value 
models. 

Milgrom and Weber have shown that in an affiliated value model, 
auctions can be rank-ordered according to expected revenue to the 
bid taker at equilibrium. The rank order is (1) (a somewhat artificial 
version of) an oral auction, (2) sealed second-price bidding, and (3) a 
tie between first-price sealed bidding and Dutch oral auctions. Hence, 
it is hard to argue that Vickrey auctions are rare because bid takers 
avoid them because of nonindependent values by bidders. 


Inertia 

It can be argued that Vickrey auctions are rare because institutions 
are slow to learn and change. In other words, the rarity of Vickrey 
auctions is evidence of a kind of implementation problem. We have 
two reasons for disbelieving this argument. First, while institutions are 
slow to change, we doubt that they are that slow purely for reasons 
of inertia. It has been over a quarter century since Vickrey’s paper 
appeared. During that time, there have been some experiments 
and some modifications of particular auction practices. Many more 
changes have been considered seriously. In addition, some completely 
new auction markets have been started. 

Second, even the quarter century time scale is misleading. Most 
auction procedures that are common developed before any formal 
analysis recommended them. Hence, one must wonder why, if there 
are no problems with it, some auction market did not stumble or 
evolve into sealed second-price bidding and recognize its advantages 
even before 1961. 


Bidder Fear of Bid Taker Cheating 

There is evidence that robustness with respect to the possibility of 
cheating is more influential than optimality in the absence of cheating 
in determining auction form. Robinson (1984, 1985) makes this case 
with respect to cheating by bidders. In his 1985 paper, he argues that 
standard sealed first-price bidding is sometimes used where, from the 
point of view of theory developed on the assumption of no cheating, 
one would expect to find oral progressive auctions. This happens 
because agreements by bidders to collude in oral auctions are stable, 
while in sealed first-price bidding they are not. While he discusses that
argument primarily with respect to a comparison between oral auctions
and sealed first-price bidding, he points out in a footnote that it
still applies if sealed second-price bidding is substituted for oral auctions
or if Dutch oral auctions are substituted for sealed first-price
bidding. Presumably, a bid taker in an oral auction who fears collu¬ 
sion by bidders can switch to a first-price sealed-bid system. If only 
some bid takers have that fear, then only some would switch, and 
there would be both oral and sealed first-price bidding. 

In addition to the problem of bidder collusion discussed by Rob¬ 
inson, price-enhancing activities by bid takers can be a concern of 
bidders. In some oral auctions, the use by auctioneers of shills and 
imaginary bids to force the price above the second-highest bidder’s 
value is notorious (Cassady 1967, chap. 12). If a bidder fears that such 
tactics are being used against him, he may be reluctant to bid to his 
full value. Such reluctance may be a wise strategy if the bidder has 
reason to believe that his intentions can be read by the auctioneer or 
will affect the auctioneer’s behavior in future auctions. In an oral 
auction, a bidder at least has the opportunity to observe the proceed¬ 
ings while he is bidding, and he can drop out at any time if he suspects 
that he is being victimized. In sealed second-price bidding, a bidder 
has no such ongoing protection. If he follows his no-cheating equilib¬ 
rium strategy, he must reveal his ultimate reservation price. If he 
fears that the bid taker, after observing this price, will insert an imagi¬ 
nary losing bid or a real losing bid from a confederate, then he has an 
incentive to bid strategically. Notice that actual cheating by the bid 
taker is not required to produce this result; mere fear of it (i.e., 
assigning a positive probability to it) will suffice. 
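
A stylized calculation shows how a mere positive probability of cheating pulls the optimal bid away from the true value. The sketch below is not a model from this paper. It assumes a high-bid-wins second-price auction, a single rival whose bid is uniform on [0, 1], and a bid taker who, with probability `cheat_prob`, inserts a losing bid equal to the winner's own bid. All names and parameter values are hypothetical.

```python
def profit_with_possible_shill(bid, value, cheat_prob):
    """Expected profit for a bid in [0, 1] against one uniform rival bid.

    With probability `cheat_prob` the bid taker inserts a losing bid equal
    to the winner's bid, so the winner pays `bid` rather than the rival's
    bid. Otherwise the usual second-price rule applies.
    """
    honest = value * bid - bid ** 2 / 2   # integral of (value - r) for r in [0, bid]
    cheated = bid * (value - bid)         # win probability times (value - bid)
    return (1 - cheat_prob) * honest + cheat_prob * cheated


def best_bid(value, cheat_prob, steps=10_000):
    """Grid search for the profit-maximizing bid on [0, 1]."""
    grid = (k / steps for k in range(steps + 1))
    return max(grid, key=lambda b: profit_with_possible_shill(b, value, cheat_prob))


for q in (0.0, 0.1, 0.5):
    # Under these assumptions the optimum is value / (1 + q).
    print(q, round(best_bid(0.8, q), 3))
```

Under these assumptions the bidder bids his value only when the cheating probability is exactly zero, and any positive probability leads him to shade his bid. This is a single-bidder decision problem against a fixed rival distribution, not an equilibrium analysis.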

Vickrey anticipated this concern and suggested that it might be
countered by having the bids delivered to and certified by a trusted 
third party. However, even if the bid taker is scrupulously honest in 
his opening of the bids, his anticipating a bidder’s intentions could 
solicit an insincere or artificially increased bid from a confederate. 
Such a bid would cost nothing if it loses. If, by miscalculation, it were 
to win (and do so at too high a price), the winning bidder may be able 
to withdraw it or the bid taker may be able to find grounds to reject it 
or somehow to compensate its maker. Again, even groundless fear of 
such behavior by a bid taker would be enough to induce bidders to 
abandon the truth-revealing dominant strategy of the no-cheating 
model in favor of strategic behavior. 


Bidder Resistance to Truth-revealing Strategies 

People of our acquaintance with experience in conducting business 
are reluctant to reveal their true costs or valuations. They are strongly
conditioned to keep such information confidential. Even in a situation
in which such conditioning is maladaptive, it would have to be overcome.
However, we believe that such conditioning will not normally
lead bidders to err. Vickrey’s model considers the auction as an iso¬ 
lated event. However, economically important auctions are seldom 
completely isolated events. A truth-revealing strategy may give away 
valuable information. It could reveal to potential competitors the ex¬ 
tent to which a firm’s technology was superior. Most important, it 
could reveal to others with whom the firm must subsequently negoti¬ 
ate precisely how much it can yield. 

In our recent work on the design of auctions for the purchase of 
electric power by utilities from cogenerators and other facilities qual¬ 
ifying under the Public Utilities Regulatory Policy Act (Rothkopf et al. 
1987), we realized that successful bidders have reason to anticipate 
extensive negotiations after the auction. In addition to negotiating 
details and arrangements with the utility awarding them the contract, 
most winning bidders will have to negotiate for financing, construc¬ 
tion, government permits, and labor. In these negotiations, a winning 
bidder would be at a distinct disadvantage if the other parties knew its 
true cost, especially if the cost were much less than the amount it was 
to receive. 

Winning bidders in other much-analyzed auctions also face subse¬ 
quent negotiations with parties possessing significant market power. 
Successful oil lease bidders must deal with drilling contractors, rig 
owners, and so forth. Successful coal lease bidders must deal with 
equipment suppliers, railroads, and coal purchasers. Successful con¬ 
struction contract bidders must deal with subcontractors and labor 
unions. 

Keeping winning bids secret is a potential way around this diffi¬ 
culty. There are two problems, however. First, secrecy may defeat the 
public scrutiny that is needed to assure the bidders or the general 
public of the honesty and fairness of the process. Second, secrecy is 
never complete. Secret information tends to give power to its holder, 
and even a small chance of a breach of secrecy justifies a deviation 
from the dominant truth-revealing strategies of the isolated auction 
model. 


A Model of a Vickrey Auction with Partial Loss of 
Revealed Rent 

In the previous section, we argued that bidders may have good rea¬ 
sons for resisting the use of truth-revealing strategies. One of the 
arguments we offered was that a truth-revealing strategy imposes on 
a successful bidder a disadvantage in subsequent negotiations. This
section considers a simple model of a low-bid-wins second-price auction
in which the winning bidder must negotiate with third parties
such as labor unions or permitting authorities. We assume that the 
third parties have some market power and that, in addition to what¬ 
ever else they may charge, they also extract some fraction of the 
economic rent of the winning bidder as revealed by the difference 
between his winning bid and the amount he gets paid under the 
second-price procedure. In our model, the bidders’ equilibrium strat¬ 
egies take account of the effect of their bids on the winner’s subse¬ 
quent negotiations. After presenting the model, we point out that a 
key result in it, that at equilibrium all the expected cost of the cap¬ 
tured rent is passed on to the bid taker, could be anticipated from the 
application of the logic of Myerson’s (1981) revenue equivalence 
theorem and that this result therefore applies to a broad class of 
symmetric auction models. 

Our model is extremely simple. We assume a low-bid-wins auction 
with two bidders. Each bidder independently and privately learns his 
exact basic cost should he win the auction. A priori, the cost for each is 
independently and uniformly distributed from zero to one. Each bid¬ 
der then uses a strategy for his bid that is an increasing function of his 
basic cost and that is independent of the still unrevealed cost of his 
competitor. The auction is a second-price auction that awards the 
contract to the low bidder at the price of the higher bidder. However, 
third parties with whom the bidder must negotiate may learn of the 
difference between the low bid and the contract price and, on the 
average, are able to extract some fraction, a, of this difference from 
the winner. The bidders know that this may happen and take account 
of its possibility in deciding on their bids. We assume that each bidder 
is risk neutral and, thus, chooses to maximize his expected profit from 
the auction. We seek a symmetric set of Nash equilibrium strategies in 
which neither bidder can unilaterally improve his expected profit. 

Mathematically, we have basic cost $c_i$, $i = 1, 2$, for the two bidders.
It is uniformly and independently distributed on [0, 1]. Bidders follow
strategies $b_i(c_i)$, $i = 1, 2$, that are increasing functions of $c_i$ with
inverse functions $b_i^{-1}(\cdot)$. When bidder $i$ has cost $c_i$, his expected profit
is given by

$$
E[\pi_i(c_i)] = \int_{b_j^{-1}(b_i(c_i))}^{1} \{b_j(c_j) - a[b_j(c_j) - b_i(c_i)] - c_i\} f(c_j)\, dc_j,
\qquad i, j = 1, 2;\ j \ne i.
$$

In this expression, the braces contain bidder $i$’s profit if he wins with a
bid of $b_i$ when bidder $j$'s bid is $b_j(c_j)$. The quantity $f(c_j)$ is the uniform
probability density that bidder $j$ has cost $c_j$. It is 1 on the interval [0, 1].
The integral is over those values of $c_j$ that will lead to bidder $i$ winning.
The derivative of $E[\pi_i(c_i)]$ with respect to $b_i$ is given by

$$
\frac{dE[\pi_i(c_i)]}{db_i} = a \int_{b_j^{-1}(b_i)}^{1} f(c_j)\, dc_j
- \{b_j(b_j^{-1}(b_i)) - a[b_j(b_j^{-1}(b_i)) - b_i(c_i)] - c_i\} f(b_j^{-1}(b_i)) \frac{db_j^{-1}(b_i)}{db_i},
\qquad i, j = 1, 2;\ j \ne i.
$$

Setting this derivative equal to zero for $i = 1$ and $i = 2$, using the
symmetry condition $b_1(c) = b_2(c) \equiv b(c)$ and the relationships
$b_i^{-1}(b_i(c_i)) = c_i$, $i = 1, 2$, and

$$
\frac{db_i^{-1}(b_i(c_i))}{db_i} = \frac{1}{b_i'(c_i)}, \qquad i = 1, 2,
$$

and simplifying gives the differential equation that a symmetric equilibrium
strategy $b(c)$ must satisfy: $a(1 - c)b'(c) = b(c) - c$. It may be
verified that the solution of this equation is

$$
b(c) = \frac{a + c}{a + 1}.
$$


When both bidders follow this strategy, a bidder whose cost is $c$ has an
expected profit of

$$
E[\pi(c)] = \frac{(1 - c)^2}{2}.
$$

This quantity is independent of $a$. Hence, all the rent captured by third
parties is passed on to the bid taker.
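
The equilibrium can be checked numerically. The sketch below assumes the rival follows the strategy (a + c)/(a + 1), computes a bidder's expected profit for an arbitrary bid, and searches a grid for the best response. The function names, the cost value 0.4, and the grid size are illustrative choices that do not appear in the paper.

```python
def equilibrium_bid(cost, a):
    """Symmetric equilibrium bid b(c) = (a + c) / (a + 1) from the text."""
    return (a + cost) / (a + 1)


def profit_against_equilibrium_rival(bid, cost, a):
    """Expected profit from `bid` when the rival bids equilibrium_bid(c_j, a).

    The low bid wins. The winner is paid the rival's (higher) bid, loses the
    fraction `a` of the bid difference to third parties, and incurs `cost`.
    """
    # The bidder wins when the rival's cost exceeds this threshold.
    threshold = min(1.0, max(0.0, bid * (a + 1) - a))

    def margin(rival_cost):
        rival_bid = equilibrium_bid(rival_cost, a)
        return rival_bid - a * (rival_bid - bid) - cost

    # The margin is linear in the rival's uniform cost, so averaging its two
    # endpoint values gives the exact integral over [threshold, 1].
    return (1 - threshold) * (margin(threshold) + margin(1.0)) / 2


def best_response(cost, a, grid_size=1000):
    """Grid search over bids in [0, 1] for the profit-maximizing bid."""
    grid = (k / grid_size for k in range(grid_size + 1))
    return max(grid, key=lambda b: profit_against_equilibrium_rival(b, cost, a))


c = 0.4
for a in (0.1, 0.5, 1.0):
    b_star = equilibrium_bid(c, a)
    print(a, round(best_response(c, a), 3), round(b_star, 3),
          round(profit_against_equilibrium_rival(b_star, c, a), 4))
```

For each value of a, the grid search should return a bid within one grid step of (a + c)/(a + 1), and the profit at that bid should equal (1 - c)^2/2 regardless of a, consistent with the statement that the bidder's expected profit does not depend on the fraction captured.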

A priori, before learning of his cost, each bidder has a 50 percent
chance of winning and an expected profit, independent of $a$, of

$$
\int_0^1 E[\pi(c)] f(c)\, dc = 1/6.
$$

The expected cost of the lower-cost bidder is one-third. When $a = 0$,
the expected payment of the bid taker is 1/3 + 2(1/6) = 2/3.

With equilibrium bidding, the expected value of the higher bid is
$(3a + 2)/[3(a + 1)]$, and the expected value of the lower bid is
$(3a + 1)/[3(a + 1)]$. The expected difference between the bids is $1/[3(a + 1)]$,
and the expected payment to the third parties is $a$ times this amount:
$a/[3(a + 1)]$. As a fraction of the bid taker's expected cost of 2/3 when
$a = 0$, this cost is $a/[2(a + 1)]$. Thus if third parties can extract 10 percent
of the difference between the bids, the extra cost to the bid taker is 4.5
percent. If they can extract half of the difference, the extra cost is 16 2/3
percent, and if
they can extract it all, the added cost is 25 percent.
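
These figures can be reproduced with exact rational arithmetic. The sketch below simply evaluates the closed-form expressions from this paragraph; the function name and the choice to return a tuple are illustrative.

```python
from fractions import Fraction


def auction_expectations(a):
    """Closed-form expectations for the two-bidder model, as exact fractions.

    Returns the bid taker's expected payment (the expected higher bid), the
    expected rent captured by third parties, and that rent as a fraction of
    the bid taker's expected cost of 2/3 when a = 0.
    """
    a = Fraction(a)
    high_bid = (3 * a + 2) / (3 * (a + 1))
    low_bid = (3 * a + 1) / (3 * (a + 1))
    third_party = a * (high_bid - low_bid)
    baseline = Fraction(2, 3)
    return high_bid, third_party, third_party / baseline


for a in (Fraction(1, 10), Fraction(1, 2), Fraction(1)):
    high, captured, extra = auction_expectations(a)
    print(a, high, captured, f"{float(extra):.1%}")
```

In each case the bid taker's expected payment exceeds 2/3 by exactly the expected amount captured by third parties, which restates the pass-through result in numerical form.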
As mentioned above, the highlighted result with this model, namely,
that the entire cost of the revealed economic rent captured by third
parties is, on the average, passed on to the bid taker, is not an artifact
of some peculiarity of the particular model we have chosen to analyze.
Rather, it is quite general. Myerson (1981) considers a rather general
single-object auction model with n risk-neutral bidders. In it, the bid¬ 
ders share a commonly known joint prior distribution on private- 
value signals and have utility functions that depend on three argu¬ 
ments: the private signal, the probability of winning the auction, and 
the payment to or by the bid taker. The joint prior distribution is 
unrestricted except that each bidder’s signal is bounded above and 
below. Myerson considers auction mechanisms characterized by two 
vectors with one component for each bidder: one of win probabilities, 
p, and one of expected payments to the seller, x. Each of these vectors 
is a function of the vector of private signals. Myerson proves that for 
any feasible auction mechanism, there exists a feasible “direct revela¬ 
tion mechanism” (i.e., a scheme in which the outcome is based on the 
revelation of his signal by each bidder and in which each bidder has 
the incentive to reveal his signal truthfully), which is equivalent in that 
it gives to the seller and each bidder the same expected utilities. 

Restricting himself to such direct revelation mechanisms, Myerson 
then proves a theorem that implies that once we know who gets the 
object being auctioned in each situation (i.e., the vector p) and how 
much utility each bidder would get if his value estimate were at its 
lowest possible level, then the seller’s expected utility from the auction 
does not depend on the payment function x. In particular, the seller 
must get the same expected utility from any auction mechanisms for 
which (1) the object always goes to the bidder with the highest value 
above a prespecified reservation price, and (2) any bidder with the 
lowest possible value signal expects zero utility. This implies that the 
seller gets the same expected revenue in any symmetric situation (in 
which zero value is considered a possible signal) regardless of the 
auction form provided only that it leads to equilibrium bidding strate¬ 
gies that increase with the value signal. This, of course, includes stan¬ 
dard sealed bidding and Vickrey auctions. This is a crude summary of 
Myerson’s revenue equivalence theorem. 

Myerson, however, does not consider models in which there are 
payments to third parties that depend on the auction form. If such 
payments are included, then his revenue equivalence theorem still 
applies except that it is applied to the combined revenue of the bid
taker and the third parties. That is what is invariant to auction form.
Hence, under the conditions considered by Myerson, modified for
payments to third parties that depend on the auction form, the expected
amount of any payment to third parties comes from the bid taker. With such
third-party payments, it is the bidders’ expected revenue rather than
that of the bid taker that is invariant across auction forms.


Thoughts on Research on Auction Design 

Our work designing auctions for the purchase of electric power and 
our consideration of the reasons for the scarcity of Vickrey auctions 
have led to some thoughts on fruitful directions for research in auc¬ 
tion design. In recent years, there has been a magnificent flowering of 
mathematical analysis of models of single isolated auctions. We do not 
doubt the value of this research. In particular, this paper has bene¬ 
fited greatly from Myerson’s work. However, we believe that more 
emphasis on formulating rather than optimizing is called for and that 
the practical value of the conclusions of some mathematical research 
with respect to “optimal auctions” is suspect. 

There are many critical assumptions in most auction models. In 
particular, the assumption of a single isolated auction almost directly 
contradicts an assertion that the auction is part of an important 
stream of commerce. It is useful to study thread in order to improve 
clothing, and it is useful to study bricks in order to improve buildings. 
However, clothing designers would have reason to be suspicious of 
any conclusions on “optimal design of threads” that were indepen¬ 
dent of the intended garment, and architects would have reason to 
question results on “optimal bricks” that were independent of the 
building design and planned construction methods. So too it is with 
auctions. It is useful to study the effects of varying auction rules in 
mathematical models of an isolated auction, but the “optimal auction” 
is likely to be context dependent. 

We believe that the most important undone research of direct im¬ 
portance for the design of auctions has to do with identifying and 
including, even crudely, in auction models considerations currently 
neglected. We believe that this “low theory” will add to the practicality 
of auction design modeling and may well lead eventually to an en¬ 
riched mathematical “high theory.” 


Conclusions 

We believe that Vickrey auctions are rare for two reasons. First, they 
are rare because robustness in the face of cheating and of fear of 
cheating is important in determining auction form along with proper¬ 
ties related to economic efficiency and allocation in the absence of 
cheating or fear of it. Vickrey auctions are not robust with respect to
cheating and fear of cheating. 

Second, and most important, Vickrey auctions are rare because 
bidders are reluctant to follow the truth-revealing strategies that the 
“proper” operation of such auctions would require. Bidders have 
good reasons to be reluctant when they may lose a fraction of the 
economic rent revealed by the sealed second-price format in subse¬ 
quent negotiations. In equilibrium in auctions with symmetric, risk- 
neutral bidders, the entire cost of this capture of revealed rent is 
borne on the average by the bid taker. 

The few auctions we report that do resemble Vickrey auctions seem 
to have characteristics that tend to offset these concerns. The auctions 
of collectibles not only allow oral bids by serious bidders but are 
selling primarily to hobbyists who, we hypothesize, are less fearful of 
cheating and less concerned about others knowing their reservation 
prices. Financial instruments also appear to be a situation in which 
bidders, especially small ones, have relatively little to lose by revealing 
their reservation prices. 
Equilibrium and Inefficiency in a Community
Model with Peer Group Effects 


Charles A. M. de Bartolome 

New York University


Public-service output depends on input expenditures, on own personal
characteristics, and on the characteristics of the other residents
in the community (the peer group effect). In a community model 
with public expenditures set by voting, with migration between com¬ 
munities, and with land price differentials (capitalization), it is shown 
that communities may become heterogeneous in composition and 
(second-best) inefficient. This equilibrium occurs when the peer 
group effect is neither “too strong” nor “too weak.” The inefficiency 
arises because an externality is created by migration. The land price 
differential does not play the part of the “price” of the better peer 
group but of a transfer payment. 


I. Introduction 

In Tiebout’s (1956) seminal model of the local public sector, the pro¬ 
duction of the public service is determined solely by physical inputs 
and community size: Tiebout concludes that the local public expendi¬ 
ture level is unanimously desired by all residents within a community 
and is efficient. There is, however, strong empirical evidence that the 
community composition (the peer group) is also a major determinant 
of the level of public service. This paper presents a model of local 
government with peer group effects and shows that an equilibrium
exists in which residents in each community differ in their desired 
public expenditure level. This equilibrium occurs when the peer 
group effect is neither “too strong” nor “too weak” and is (second- 
best) inefficient. 

A model is developed with families that have children of either low 
or high ability. Education is chosen as the motivating example of a 
public service showing the peer group effect. Educational expendi¬ 
ture is financed by local taxes, and the school technology is such that 
all children living in the same community attend the same school. 
Two empirical findings motivate the model. 1 First, more able children 
receive greater benefit than less able children from expenditures on 
educational inputs such as more experienced teachers (Summers and 
Wolfe 1977, table 2). It might be supposed that better facilities such as 
laboratories also differentially benefit more able children. 2 This dif¬ 
ferential effect is similar to a difference in tastes. Following Tiebout 
and considering this input effect alone, efficiency requires the forma¬ 
tion of separate communities, families of more able children being 
grouped in communities with high input levels. Private incentives also 
promote separation: families of less able children migrate into com¬ 
munities with low-input schools in order to avoid the high taxes asso¬ 
ciated with high inputs. 

The second stylized fact concerns composition: the presence of 
more able children in the classroom has a large favorable effect on 
educational achievements, perhaps by imparting higher motivation or 
better learning discipline. This is the peer group effect considered in 
this paper. Henderson, Mieszkowski, and Sauvageau (1978) find that 
the increase in educational achievement, consequent on an improve¬ 
ment in peer group, is equal for all abilities and shows diminishing 
marginal returns. Coleman et al. (1966) and Summers and Wolfe 
(1977, table 1, rows 25-27) find that the improvement is greater for 
the less able child. 3 Considering this effect alone, efficiency requires 


1 These empirical findings are not unchallenged. Hanushek (1986) presents a good 
review of the educational production function literature. 

2 This differential benefit may be used to justify the “elite” schools of New York City
and the former “grammar” schools in the United Kingdom. 

3 This is interpreting Coleman et al. slightly out of context because they focused on
the effect of mixing majority and minority children. “If a white pupil from a home that
is strongly and effectively supportive of education is put in a school where most students
do not come from such homes, his achievement will be little different than if he
were in a school composed of others like himself. But if a minority pupil from a home
without much educational strength is put with schoolmates with strong educational
backgrounds, his achievement is likely to increase” (p. 22). In the strict context of this
model, Summers and Wolfe find that the sensitivity of educational achievement to the
percentage of high achievers is less for students with higher third-grade scores (their
table 1, row 26).
the integration of communities with children of different abilities.
Considering private incentives, the greater sensitivity of the less able 
children, found by Coleman et al. and Summers and Wolfe, implies 
that families of less able children will be prepared to pay a premium 
to migrate into a community composed predominantly of families of 
more able children, promoting integration. 

When either the input or the peer group effect is considered in 
isolation, private incentives lead to efficiency. I show that the conclu¬ 
sions may change when both effects operate simultaneously. If nei¬ 
ther effect is so strong as to dominate, laissez-faire may be unable to 
efficiently resolve the tension between the tendency to segregate (due 
to the different taste for inputs) and the tendency to integrate (due to 
the different taste for peer groups): all communities may include both 
family types, with different majorities in each area voting different 
educational expenditures. This arrangement is never efficient. The 
efficiency loss arises because the presence of the more able child is a 
favorable externality, increasing the educational achievement of all 
families in the community; this causes a divergence of private and 
social benefits when families choose residency. Because the pattern of 
migration is determined by the level of input expenditures across 
communities, the choice of inputs by voters in a community imposes 
an externality in adjacent communities. I also find that housing rents 
rise in the community with the better peer group (capitalization). 
However, in the pseudocompetitive allocation of resources, this rent
differential does not play the part of the “price” of the better peer 
group but of a transfer payment. As opposed to the models of Flat¬ 
ters, Henderson, and Mieszkowski (1974), Stiglitz (1977), and Brueck- 
ner (1979), the policy response to correct for the market failure does 
not involve intercommunity transfers but a subsidy/tax schedule on 
local public expenditure. 

Arnott and Rowse (1987) use the peer group findings of Hender¬ 
son et al. and Summers and Wolfe to ask the first-best normative 
question: What school compositions maximize the sum of all test 
scores? In contrast, I characterize a positive equilibrium and then ask 
the normative question whether the outcome is efficient. My effi¬ 
ciency concept is second-best: the planner is restricted to allocations 
he can implement without knowing an individual's ability. 

On the topic of community composition, Pack and Pack (1978) 
empirically determine the spread in implied demands within a com¬ 
munity and find it to be too large for communities to be considered 


4 This is an economic explanation for the policy of comprehensive schools in the
United Kingdom. In the context of majority and minority students, considered by
Coleman et al., the policy response in the United States was forced mixing by busing.
homogeneous. In the model of Stiglitz (1977), mixing arises because
of advantages from the returns to scale in the provision of a local 
public good. Berglas and Pines (1981) find that mixing arises natu¬ 
rally with the provision of several public services with different cost 
structures. In Berglas (1976) and Stiglitz (1983), mixing occurs because
of complementarity in production between different worker
types. Brueckner and Lee (1987) suggest mixing because of altruism. 
This paper suggests that community heterogeneity may possibly be 
explained by the peer group effect. 

Education is used in this paper as the example to motivate peer 
group effects. There are many other cases in which peer group ef¬ 
fects appear important. McPheters and Stronge (1974), Pogue (1975), 
Oates (1977), and Schwab and Zampelli (1987) establish the impor¬ 
tance of community composition in the provision of public safety. In 
demand analysis, the willingness to pay for many innovative products 
may critically depend on the characteristics of the other buyers. The 
model can be readily reinterpreted to these cases. 

Any model of the local public-service sector must make assump¬ 
tions about (1) the technology of service production, (2) the method 
by which service levels are set, (3) the local land and housing markets, 
(4) the method by which expenditures are financed, and (5) the mod¬ 
el’s static or dynamic condition. The novelty of my model lies in the 
incorporation of the peer group effect as a determinant of the public- 
service output: as indicated, this follows the empirical evidence of 
Coleman et al. (1966), Summers and Wolfe (1977), and Henderson et 
al. (1978). I assume that there are constant returns to community size: 
this follows the empirical finding of Borcherding and Deacon (1972) 
and Bergstrom and Goodman (1973). Service levels in my model are 
set politically by majority voting. That local public-service levels are 
“as if” voted by the median voter is supported empirically by Pom- 
merehne and Frey (1976) and Inman (1978). Voting to set service 
levels is used in the models of Westhoff (1977), Rose-Ackerman 
(1979), Bewley (1981), and Epple, Filimon, and Romer (1983, 1984). 5

Effectively, I assume that communities have fixed populations and 
that expenditures are financed by a local head tax. The rate of change 
in the U.S. housing stock is much lower than the rate at which families 
change houses, 6 so that the model should be interpreted as a short- 

5 Other allocation mechanisms are possible: in Stiglitz (1977), Berglas and Pines
(1981), Epple and Zelenitz (1981), Henderson (1985), and Brueckner and Lee (1987), 
public-service levels are set by profit-maximizing property developers. 

6 As seen in data from the 1980 Census of Housing, vol. 1, pt. 1, the number of
households that changed dwellings January 1979-March 1980, as a percentage of all 
households, was 22.7 percent (table 78). In contrast, the number of new dwellings built 
in the same 15 months, as a percentage of the total housing stock, was 3.5 percent (table 
80). This suggests considerable fixity in the U.S. housing stock in the short run. 
run model. The head tax ensures that the inefficiency is not related to
a distortion in the housing market. My structure is equivalent to an 
institutional structure in which the number and size of houses in each 
community are fixed, and there is a property tax. My approach is 
similar to the bid-rent approach of Wheaton (1977). Oates (1969) and 
many other authors find capitalization, which suggests that some 
housing variable is relatively fixed. 7 Property taxes are assumed in all 
the cited models with land and housing markets. 8 In addition, Hamil¬ 
ton (1975) shows that a property tax, when used with zoning, is equiv¬ 
alent to a local head tax. 9 Finally, my model, like all the cited models, 
is static. 

To distinguish the peer group effect from income effects, I initially 
assume that all families have equal incomes. This assumption is re¬ 
laxed in Section III. Oates (1977) conjectured that peer group quality 
might be proxied by income, a conjecture that Hamilton (1983) used 
to explain the “flypaper” effect. This approach is followed in Section 
III. With equal incomes, the similarity of housing is unimportant. 
With variable incomes, I would expect housing sizes to vary, but even 
without peer group effects, this introduces many additional complex¬ 
ities that I want to avoid (see, e.g., Rose-Ackerman 1979; Epple et al. 
1984). In addition, I assume that there is no private educational sec¬ 
tor. The possibility that families will opt out of public schools and into 
private schools has been studied by Stiglitz (1974). In the United 
States, private expenditure on elementary and secondary schools is 7 
percent of total expenditure, 10 so that this is unlikely to be a major 
factor in determining public school expenditure (although it may be 
important within a few communities). Within schools there is “track¬ 
ing,” so that different children within the same school are treated 
differently. However, mixing outside the classroom also affects learn- 


7 Assumptions of greater flexibility are of course possible: Ellickson (1971),
Rose-Ackerman (1979), and Epple et al. (1983) assume that the total housing stock is fixed in
each community but that the units of housing per family (or the population density) are
variable. Epple and Zelenitz (1981) and Epple et al. (1984) assume that land per community
is fixed but that land and nonland factors are used to manufacture housing.
Henderson (1985) argues that community land areas have been relatively fixed in the
northeastern United States but flexible elsewhere in the United States.

8 For 1984/85, public elementary and secondary schools in the United States were
funded by 6.5 percent federal aid, 48.8 percent state aid, and 44.7 percent local revenue
(1987 Digest of Educational Statistics, table 93). A large part of federal and state aid is
categorical. A proportional subsidy is easily accommodated in my model by redefining
the unit of inputs as the amount of inputs bought by $1.00 raised locally.

9 Other financing assumptions are possible. Berglas and Pines (1981) finance expenditures
[...]. In Brueckner and Lee (1987), developers sell membership rights.

10 [...] expenditure on elementary and secondary schools in the private
sector was $8 billion and in the public sector was $104 billion (1987 Digest of Educational
Statistics, table 22).
ing attitudes: the relevant peer group variables in Summers and
Wolfe (1977) refer to the school. 11

The paper is organized as follows. The positive model is described 
in Section II. Section III establishes the existence of equilibrium when
communities both are heterogeneous and have majorities of different
types; existence requires that the peer group effect be neither “too
strong” nor “too weak.” The normative model is solved in Section IV
and the inefficiency of laissez-faire is established. The cause of the
inefficiency, the role of the rent premium, and policy implications are
discussed in Section V. Section VI presents concluding remarks.


II. A Positive Model 

The model has families of two types. There are $N_1$ families of children
of low ability $a_1$ and $N_2$ families of children of high ability $a_2$. The
educational achievement $e$ of a child depends on his or her ability $a$,
per capita educational inputs $I$, and the proportion $\theta$ of more able
children in the community (the peer group): $e = e(I, \theta, a)$. A more
able child is assumed to gain more from educational inputs, or

$$\frac{\partial e(I, \theta, a_1)}{\partial I} < \frac{\partial e(I, \theta, a_2)}{\partial I}.$$

A less able child is assumed to gain no less from a peer group
improvement, or 12

$$\frac{\partial e(I, \theta, a_2)}{\partial \theta} \le \frac{\partial e(I, \theta, a_1)}{\partial \theta}.$$

The utility U of a family decision maker depends on family con¬ 
sumption C and the educational achievement e of the child. As indi¬ 
cated in the Introduction, housing is fixed and all houses are identi- 


11 In Summers and Wolfe, the relevant peer group variables are percentage of high
achievers and percentage of low achievers in all grades of pupil’s sixth-grade school.

12 Henderson et al. (1978) find that the peer group effect is concave but independent
of ability. The test score of a child in a grade 3 mathematics test is

$$e = \alpha(I, a) + 1.761(\text{mean class IQ}) - .015(\text{mean class IQ})^2,$$

where $\alpha$ is the intercept. I interpret mean class IQ to be a proxy for the peer group.
Summers and Wolfe (1977, table 1, eq. 2) estimate expected sixth-grade test score as

$$e = \alpha(I, a) + .02(\text{teacher's experience}) \times (\text{third-grade score}) + .68(\%\ \text{high achievers}) - .01(\%\ \text{high achievers}) \times (\text{third-grade score}) - .08(\%\ \text{low achievers}),$$

where $\alpha$ is the intercept. I interpret third-grade test score as a proxy for ability,
teacher’s experience as one measure of inputs (more experienced teachers cost more),
and % high (low) achievers as the percentage with high (low) ability.
cal; housing per se is therefore omitted as an argument in the utility
function. For analytical convenience, utility is assumed to be additively
separable. A family with a child of low ability obtains utility as

$$U(C, e(I, \theta, a_1)) = F(C) + G(I) + H(\theta),$$

and a family with a child of high ability obtains utility as

$$U(C, e(I, \theta, a_2)) = F(C) + R(I) + S(\theta),$$

where $F$, $G$, $R$, $H$, and $S$ are strictly concave functions. The greater
input sensitivity of the more able child is interpreted to imply

$$G'(I) < R'(I). \tag{1}$$

The peer group effect is interpreted to imply

$$S'(\theta) \le H'(\theta), \tag{2}$$

where I will associate equality with the case of Henderson et al. (1978)
and strict inequality with the case of Summers and Wolfe (1977). 13 A
family is henceforth labeled by the ability of its child.

Because peer group effects should at least initially be differentiated 
from income effects, all families are assumed to have equal endowed 
income $y$. (This assumption is relaxed in Sec. III.) With only two
family types, the discussion can be confined to two communities
labeled as the urban area $u$ and the suburbs $s$, with the suburbs being
the area with the better peer group. 14 The urban area contains $n_u$
families and the suburban area $n_s$ families. As indicated in the
Introduction, $n_u$ and $n_s$ are fixed. I am interested in how families of
different abilities distribute themselves between the two areas. If we
denote the urban composition as $\theta_u$ and the suburban composition as
$\theta_s$, the more able families must be distributed as

$$N_2 = n_s \theta_s + n_u \theta_u, \quad \text{or} \quad \theta_u = \frac{N_2 - n_s \theta_s}{n_u}.$$
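
As a small illustration of this accounting identity, the sketch below computes the urban composition implied by a given suburban composition. The function name and the numbers in the usage lines are hypothetical and are not taken from the paper.

```python
def urban_composition(theta_s, n2, n_s, n_u):
    """Urban share of more able families implied by N_2 = n_s*theta_s + n_u*theta_u.

    theta_s : suburban share of more able families
    n2      : total number of more able families (N_2)
    n_s, n_u: fixed suburban and urban populations
    """
    theta_u = (n2 - n_s * theta_s) / n_u
    if not 0.0 <= theta_u <= 1.0:
        raise ValueError("theta_s is not feasible for these populations")
    return theta_u


if __name__ == "__main__":
    # Hypothetical metropolitan area: 80 more able families, 100 homes per area.
    theta_u = urban_composition(0.6, n2=80, n_s=100, n_u=100)
    print(theta_u)  # share of more able families left in the urban area
```

Because populations are fixed, any improvement in the suburban peer group is matched by a deterioration of the urban one, which is why the model can be written in terms of the suburban composition alone.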

The school technology is such that all families resident in a community
attend the same school, all urban families receiving inputs $I_u$ per

13 The assumption of strict concavity of $H(\cdot)$ and $S(\cdot)$ is made for expositional
simplicity only and is actually unnecessarily strong. My results require only either
$S' \le H'$ and $S'', H'' < 0$ (as in Henderson et al.) or $S' < H'$ and $S'', H'' \le 0$
(as in Summers and Wolfe).

14 Proposition 1 below implies that no generality is lost by considering only two
communities. If there were more than two communities, those with majorities of the
same type would become indistinguishable, so that the metropolitan area would behave
as if it comprised only two communities. However, with many communities, there are
potentially many equilibria, depending on which communities are of one type and
which are of the other. A full model would need to explain how a community’s type is
determined. Pending this extension, each community type is considered to be
exogenously determined.
head and peer group $\theta_u$. Inputs are financed by local taxes. Educational
costs are assumed to show constant returns to community size,
and the unit of input is chosen so that one unit of input costs one unit
of consumption (the numeraire). The optimal outcome of any public-policy
discussion critically depends on the instruments available to
government. This paper assumes the second-best world in which the
tax collector is constrained to levy identical taxes on all families in the
community. 15 Each urban family therefore pays educational taxes $I_u$.
The housing rent in the urban area is $r_u$, and the central or metropolitan
government may give a lump-sum transfer $T$ to families so that
the consumption of an urban family is $C_u = y - I_u - r_u + T$. Similar
notation applies for a suburban family that achieves consumption
$C_s = y - I_s - r_s + T$. In order to keep the model closed, I assume that all
rents are collected by a central government and returned as the lump-sum
transfer $T$. The central government budget balance requires that
$T = (n_s r_s + n_u r_u)/(n_s + n_u)$. By substitution, the net suburban rent $r$
and net urban rent (after transfers) become

$$r \equiv r_s - T = \frac{n_u (r_s - r_u)}{n_s + n_u},$$

$$r_u - T = \frac{n_s (r_u - r_s)}{n_s + n_u} = -\frac{n_s}{n_u}\, r.$$

The consumptions of an urban family and a suburban family become

$$C_u = y - I_u + \left(\frac{n_s}{n_u}\right) r,$$

$$C_s = y - I_s - r.$$

The assumption that the central government returns rent revenue 
as lump-sum transfers is made for expositional simplicity only and is 
not important to the results. If rents were not returned, the model 
would have to include the welfare effects on landlords living outside 
the system. The important point, which is captured in the construc¬ 
tion, is that the family budget available to finance consumption and 
pay taxes does depend on where the family lives. 
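
The substitution that produces the net rents can be verified numerically. The following is a minimal sketch with hypothetical function names and made-up rents and populations; it only checks that returning all rent revenue as an equal lump sum makes the net urban rent equal to minus n_s/n_u times the net suburban rent.

```python
import math


def lump_sum_transfer(r_s, r_u, n_s, n_u):
    """Per-family transfer T that returns all rent revenue (budget balance)."""
    return (n_s * r_s + n_u * r_u) / (n_s + n_u)


def net_rents(r_s, r_u, n_s, n_u):
    """Net suburban rent r = r_s - T and net urban rent r_u - T."""
    transfer = lump_sum_transfer(r_s, r_u, n_s, n_u)
    return r_s - transfer, r_u - transfer


if __name__ == "__main__":
    # Hypothetical gross rents and populations, for illustration only.
    r_s, r_u, n_s, n_u = 3.0, 1.0, 40, 60
    r, net_urban = net_rents(r_s, r_u, n_s, n_u)
    print(r, net_urban)
    # The closed forms from the text:
    print(math.isclose(r, n_u * (r_s - r_u) / (n_s + n_u)))
    print(math.isclose(net_urban, -(n_s / n_u) * r))
```

Net rents sum to zero across all families, so a positive suburban premium is exactly a transfer from suburban to urban residents.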

Inputs are determined by majority voting. A family votes myopi¬ 
cally, ignoring the possible effects of the input level on the house rent 
and on community composition, 16 so that a majority of less able 


15 This assumption can be motivated either by the positive observation that educational
taxes are not levied according to the child's ability or by information constraints:
individual ability is private information but mean ability is observable.

16 When families vote ignoring all effects on rents or house prices, the distinction
between renters and house owners is irrelevant.
families (in the urban area) would vote inputs as

$$\theta_u < .5: \quad I_u = \operatorname*{argmax}_I \; F\!\left(y - I + \frac{n_s}{n_u}\, r\right) + G(I) \quad \text{or} \quad F'(C_u) = G'(I_u). \tag{3}$$

A majority of more able families (in the suburbs) would vote inputs as

$$.5 \le \theta_s: \quad I_s = \operatorname*{argmax}_I \; F(y - I - r) + R(I) \quad \text{or} \quad F'(C_s) = R'(I_s). \tag{4}$$
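
To make the voting conditions concrete, the sketch below solves them in closed form under functional forms that are assumptions made here for illustration and do not appear in the paper: F(C) = ln C, G(I) = g ln I, and R(I) = rho ln I with g < rho, which satisfies G'(I) < R'(I). The function names and parameter values are likewise hypothetical.

```python
import math


def voted_inputs(y, r, n_s, n_u, g, rho):
    """Voted input levels under the assumed log forms.

    Urban:    1/(y - I + k*r) = g/I    =>  I_u = g*(y + k*r)/(1 + g)
    Suburban: 1/(y - I - r)   = rho/I  =>  I_s = rho*(y - r)/(1 + rho)
    where k = n_s/n_u and r is the net suburban rent.
    """
    k = n_s / n_u
    i_u = g * (y + k * r) / (1 + g)
    i_s = rho * (y - r) / (1 + rho)
    return i_u, i_s


def urban_objective(i, y, r, n_s, n_u, g):
    """Less able family's objective F(C_u) + G(I), peer group term omitted."""
    return math.log(y - i + (n_s / n_u) * r) + g * math.log(i)


if __name__ == "__main__":
    y, r, n_s, n_u, g, rho = 10.0, 1.0, 100, 100, 0.2, 0.5
    i_u, i_s = voted_inputs(y, r, n_s, n_u, g, rho)
    print(i_u, i_s)
    # First-order conditions F'(C) = G'(I) and F'(C) = R'(I):
    print(1 / (y - i_u + (n_s / n_u) * r), g / i_u)
    print(1 / (y - i_s - r), rho / i_s)
```

Under these assumed forms the voted inputs depend only on the net rent, in line with the separability argument used later to write inputs as functions of r alone.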


Community compositions and rent premium are determined 
through migration. A family evaluates each community, taking the 
input level, the associated tax, the housing rent, and the peer group as 
given, and moves into the community that gives the highest utility. 
Community compositions and house rents adjust until no family can 
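obtain a higher utility by moving. No migration by the less able families
implies either

$$\theta_s < 1, \quad F(y - I_s - r) + G(I_s) + H(\theta_s) = F\!\left(y - I_u + \frac{n_s}{n_u}\, r\right) + G(I_u) + H\!\left(\frac{N_2 - n_s \theta_s}{n_u}\right) \tag{5a}$$

or

$$\theta_s = 1, \quad F(y - I_s - r) + G(I_s) + H(1) \le F\!\left(y - I_u + \frac{n_s}{n_u}\, r\right) + G(I_u) + H\!\left(\frac{N_2 - n_s}{n_u}\right). \tag{5b}$$

Similarly, for the more able family, either

$$0 < \theta_u, \quad F\!\left(y - I_u + \frac{n_s}{n_u}\, r\right) + R(I_u) + S\!\left(\frac{N_2 - n_s \theta_s}{n_u}\right) = F(y - I_s - r) + R(I_s) + S(\theta_s) \tag{6a}$$

or

$$0 = \theta_u, \quad F\!\left(y - I_u + \frac{n_s}{n_u}\, r\right) + R(I_u) + S(0) \le F(y - I_s - r) + R(I_s) + S\!\left(\frac{N_2}{n_s}\right). \tag{6b}$$

Equations (1)-(6) determine the model. The equilibrium values $(\theta_s, r, I_u, I_s)$
are the solutions to equations (3)-(6). Proposition 1 below
ensures that, if the two communities have majorities of the same
ability level, they are identical.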
Proposition 1. Equilibrium requires either $\theta_u < .5 \le \theta_s$ or $\theta_u = \theta_s$.

Proof. See Appendix A. 

Five different solutions are possible. Solution 1 corresponds to heterogeneous
and distinct communities: Solution 1 has $0 < \theta_u < .5 \le \theta_s < 1$
and solves equations (3), (4), (5a), and (6a). Solutions 2, 3, and 4
correspond to the corner solutions: Solution 2 has $0 < \theta_u < .5$ and
$\theta_s = 1$ and solves equations (3), (4), (5b), and (6a). Solution 3 has $0 = \theta_u$
and $.5 \le \theta_s < 1$ and solves equations (3), (4), (5a), and (6b). Solution 4
has $0 = \theta_u$ and $\theta_s = 1$ and solves equations (3), (4), (5b), and (6b).
Finally, there is the solution with identical communities, or solution 5
has $\theta_u = \theta_s = N_2/(N_1 + N_2)$, $r = 0$, and $I_u = I_s$, which solves either
equation (3) (if $N_2/(N_1 + N_2) < .5$) or equation (4) (if $.5 \le N_2/(N_1 + N_2)$).

The solution with identical communities (solution 5) always exists. 
The interesting possibility on which I focus occurs when the two 
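communities are dissimilar. The condition that the majorities in the
two communities are of different ability level, or $\theta_u < .5 \le \theta_s$,
corresponds to a lower bound $\underline{\theta}$ to permissible $\theta_s$:

$$\underline{\theta} = \max(.5,\ \text{value of } \theta_s \text{ when } \theta_u = .5) \le \theta_s$$

or

$$\underline{\theta} = \max\!\left(.5,\ \frac{N_2 - N_1 + n_s}{2 n_s}\right) \le \theta_s.$$

The feasibility conditions $0 \le \theta_u$ and $\theta_s \le 1$ imply that there is an
upper bound $\bar{\theta}$ to permissible $\theta_s$:

$$\theta_s \le \bar{\theta} = \min(1,\ \text{value of } \theta_s \text{ when } \theta_u = 0)$$

or

$$\theta_s \le \bar{\theta} = \min\!\left(1,\ \frac{N_2}{n_s}\right).$$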
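
The two bounds follow directly from the population identity, and a short sketch makes the calculation explicit. The function name and the example populations are hypothetical; the expressions assume, as in the text, that the urban population is N_1 + N_2 - n_s.

```python
def composition_bounds(n1, n2, n_s):
    """Lower and upper bounds on the suburban composition theta_s.

    lower: theta_s must be at least .5 and large enough that theta_u <= .5,
           i.e. theta_s >= (N_2 - N_1 + n_s) / (2 * n_s).
    upper: theta_s <= 1 and theta_u >= 0, i.e. theta_s <= N_2 / n_s.
    """
    lower = max(0.5, (n2 - n1 + n_s) / (2 * n_s))
    upper = min(1.0, n2 / n_s)
    return lower, upper


if __name__ == "__main__":
    # Hypothetical case 1: more able families are a minority overall.
    print(composition_bounds(n1=120, n2=80, n_s=100))
    # Hypothetical case 2: more able families are a large majority overall.
    print(composition_bounds(n1=60, n2=140, n_s=100))
```

In the first case the binding constraints are the suburban majority requirement and the non-negativity of the urban share; in the second they are the urban majority requirement and the ceiling of one.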

The solution procedure is first to solve equations (3), (4), (5a), and
(6a), without any restriction on $\theta_s$; that is, to find the rent and
compositions that ensure no migration when assuming that urban and
suburban inputs are “as if” voted by majorities of less able and more
able families, respectively. I impose later the necessary conditions on
$\theta_s$: $\underline{\theta} \le \theta_s \le \bar{\theta}$.

From equations (3) and (4), inputs are functions of $r$ only and can
be written as $I_u(r)$ and $I_s(r)$: because of the assumed separability, voted
input levels depend on net income and not on the peer group. The
AA' curve in figure 1 defines the $(\theta_s, r)$ combinations for which less
able families do not migrate, or, from equation (5a),

$$F(y - I_s(r) - r) + G(I_s(r)) + H(\theta_s) = F\!\left(y - I_u(r) + \frac{n_s}{n_u}\, r\right) + G(I_u(r)) + H\!\left(\frac{N_2 - n_s \theta_s}{n_u}\right). \tag{5'}$$
Proposition 2. The AA' curve is upward sloping and passes below
the point $(\theta_s = N_2/(N_1 + N_2),\ r = 0)$.

Proof. See Appendix A. 

The rent premium $r$ on the AA' curve is interpreted as the less able
family’s willingness to pay for any given suburban environment $\theta_s$: its
willingness to pay increases as the suburban composition improves.
The AA' curve passes below the point $(\theta_s = N_2/(N_1 + N_2),\ r = 0)$
because less able families are choosing the urban consumption/input 
bundle. If there is no composition difference, a less able family will 
remain in the suburbs only if the higher tax burden is offset by a 
lower rent. 

A similar description applies for the more able families. On the BB'
curve, more able families do not migrate, or, if we use equation (6a)
and remember that $I_u$ and $I_s$ are functions of r only,

$$F\left(y - I_u(r) + \frac{n_s}{n_u}r\right) + R(I_u(r)) + S\left(\frac{N_2 - n_s\theta_s}{n_u}\right) = F(y - I_s(r) - r) + R(I_s(r)) + S(\theta_s). \qquad (6')$$


Proposition 3. The BB' curve is upward sloping and passes above
the point ($\theta_s = N_2/(N_1 + N_2)$, $r = 0$).

Proof. See Appendix A.

The BB' curve is interpreted as the more able family's willingness to
pay for any given suburban environment $\theta_s$. It is upward sloping but
passes above the point ($\theta_s = N_2/(N_1 + N_2)$, $r = 0$): the suburban
consumption/input bundle is being chosen by more able families so
that, if the two communities have identical composition, the more able
families are prepared to pay a premium to live in the suburbs.

Proposition 4. The intersection of the AA' and BB' curves is
unique.

Proof. See Appendix A.

The simultaneous solution of equations (3)-(6) is given by the
intersection ($\theta^*$, $r^*$) of the AA' and BB' curves. Below $\theta^*$, the
composition difference is relatively small so that the input difference
dominates: the willingness to pay of the urban more able family (given by
the BB' curve) exceeds the willingness to pay of the suburban less able
family (given by the AA' curve). Migration therefore occurs with $\theta_s$
rising. Conversely, above $\theta^*$ the peer group difference dominates and
urban less able families outbid suburban more able families with $\theta_s$
falling.

Fig. 1. “Equilibrium” ($\theta_s$, r) combinations ensuring no migration

Equilibrium with dissimilar communities requires no migration and
that the consequent compositions be of the assumed majorities and
feasible, or $\underline{\theta} \le \theta^* \le \bar{\theta}$. The three possibilities for dissimilar
communities are shown in the panels of figure 2 and are described as
follows: (a) Either $\theta^* < \underline{\theta}$ or $\bar{\theta} < \underline{\theta}$ (there is no permissible range).
Dissimilar communities cannot coexist, and integration (solution 5) is
the only equilibrium. (b) $\underline{\theta} < \theta^* < \bar{\theta}$. With this possibility, both
communities are heterogeneous and have different majorities (solution
1). (c) $\underline{\theta} \le \bar{\theta} < \theta^*$. In this case, the more able family's willingness to pay
to move to the suburbs (higher inputs) always exceeds the less able
family's willingness to pay for the better peer group. So $\theta_s = \bar{\theta}$ is the
solution, or at least one community is homogeneous (solutions 2, 3,
and 4).$^{17}$

Fig. 2. Characterizing the asymmetric equilibrium

III. Comparative Statics and the Existence of 
Solution 1 

I use the comparative statics to establish the following proposition of 
existence. 

Proposition 5. An equilibrium with heterogeneous and dissimilar
communities (solution 1) exists when the peer group effect is neither
too strong nor too weak with respect to the input sensitivity
difference, or when the income difference is not too large, and provided
$\underline{\theta} < \bar{\theta}$.

The statement “the less able become more peer group sensitive” is
interpreted to mean a change to a different $H(\cdot)$ function, for which
the differential $H'(\theta)$ is larger at all $\theta$ or for which $H(\theta_s) - H(\theta_u)$ is
larger at given $\theta_s$ and $\theta_u$. Rearranging equation (5'), we get

$$H(\theta_s) - H(\theta_u) = F\left(y - I_u(r) + \frac{n_s}{n_u}r\right) + G(I_u(r)) - F(y - I_s(r) - r) - G(I_s(r)). \qquad (5'')$$

To find the shift in AA' as the function $H(\cdot)$ changes, consider the
implied changes in $\theta_s$ when r is held constant. With r constant, the
right-hand side is unchanged, and therefore $\theta_s$ must fall to restore
the equality of equation (5''). Put differently, the AA' curve shifts to
the left (fig. 3), and the equilibrium values of r and $\theta_s$ fall. The
increased attractiveness of the suburban peer group increases the
tendency of the urban less able families to migrate, making the
communities more alike: it should be noted that, even though a less able
family has a higher willingness to pay for a given peer group
difference, the general equilibrium effect is to lower the suburban rent
premium after the change in $\theta_s$ is taken into account.

Fig. 3. Increasing the peer group sensitivity of less able families

17 With panel c of fig. 2, if $\bar{\theta} = 1$ and the associated $\theta_u > 0$ (i.e., solution 2), more able
families are in both communities and the rent is bid up by the more able urban families
to their willingness to pay $r^B$. Alternatively, if $\bar{\theta} < 1$ (i.e., solution 3), it is the less able
families that are in both communities, and the rent is bid up to their willingness to pay
$r^A$. (If $\bar{\theta} = 1$ and the associated $\theta_u = 0$ [i.e., solution 4], the rent is undetermined in the
range $[r^A, r^B]$. This last possibility requires that the community sizes be $n_u = N_1$ and
$n_s = N_2$.)

Similarly, as more able families become more peer group sensitive
(the function $S(\theta)$ changing to a function with a larger differential
$S'(\theta)$), the BB' curve shifts to the left. In the case of Henderson et al.
(1978), all families are affected equally by the peer group, $H'(\theta) = S'(\theta)$.
In this case, an increase in the strength of the peer group is
associated with equal leftward shifts of the AA' and BB' curves, or $r^*$
is unchanged but $\theta^*$ falls.$^{18}$ In the case of Summers and Wolfe (1977),
less able families are more sensitive to the peer group, and I associate
an increase in the strength of the peer group effect with an increase in
the sensitivity difference, that is, with the functions changing so that
$H'(\theta)$ rises or $S'(\theta)$ falls, and $\theta^*$ falls.

18 In the case of Henderson et al. ($H(\cdot) \equiv S(\cdot)$), the suburban rent premium $r^*$ adjusts
until $I_u = I_s$. Communities are dissimilar in composition but similar in expenditure per
head. This is easily seen from the lemma in App. A.

As the peer group effect becomes stronger, $\theta^*$ falls and there is a
progression from panel c to panel a in figure 2. For effects of
intermediate strength, panel b or solution 1 is the equilibrium. These
comparative statics show that heterogeneous communities of unequal
composition (i.e., solution 1) can always exist provided that the peer
group effect is of intermediate strength (and provided that the
community sizes are such that $\underline{\theta} < \bar{\theta}$). Without the peer group effect, either
one community would be homogeneous (solutions 2, 3, and 4) or both
communities would be identical (solution 5).

In Appendix B, the comparative statics are repeated for a change 
in the input sensitivity difference. As the more able families become 
more input sensitive (increasing the input sensitivity difference), the 
AA' curve shifts to the right and the BB' curve shifts to the left, so
that there is a progression from panel a to panel c in figure 2. Pro¬ 
vided that the input sensitivity difference is of intermediate strength 
(with respect to the peer group sensitivity difference), the outcome of 
panel b is achieved: both communities are heterogeneous but dissimi¬ 
lar. Intuitively, the peer group effect is associated with a tendency to 
integrate and the input sensitivity difference is associated with a ten¬ 
dency to segregate: two heterogeneous but dissimilar communities 
exist if neither effect is too strong. 
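
The classification of equilibria by the position of the AA'/BB' intersection can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the two willingness-to-pay curves are hypothetical straight lines (the model's curves come from the voting and no-migration equations, which are not solved here), the population figures are invented, and the function names `theta_bounds`, `intersect`, and `classify` are introduced for this example. The bounds use the adding-up constraint n_u θ_u + n_s θ_s = N_2, and a tie at a bound is treated as panel b, which is a choice made for the example.

```python
def theta_bounds(N1, N2, n_u, n_s):
    """Return (lower, upper) bounds on theta_s, the suburban share of more able families."""
    assert n_u + n_s == N1 + N2, "community sizes must exhaust the population"
    # Adding-up constraint: n_u * theta_u + n_s * theta_s = N2.
    lower = max(0.5, (N2 - 0.5 * n_u) / n_s)  # theta_s at which theta_u = .5
    upper = min(1.0, N2 / n_s)                # theta_s at which theta_u = 0
    return lower, upper


def intersect(aa, bb, lo=0.0, hi=1.0, tol=1e-10):
    """Bisection for the theta at which aa(theta) = bb(theta).

    Assumes AA' lies below BB' at `lo` and above it at `hi`,
    i.e. AA' crosses BB' from below exactly once on [lo, hi].
    """
    if aa(lo) > bb(lo) or aa(hi) < bb(hi):
        raise ValueError("AA' does not cross BB' from below on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if aa(mid) < bb(mid):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def classify(theta_star, lower, upper):
    """Map the intersection to panel 'a', 'b', or 'c' of figure 2."""
    if upper < lower or theta_star < lower:
        return "a"  # no permissible range or theta* too low: integration (solution 5)
    if theta_star > upper:
        return "c"  # corner: at least one community is homogeneous
    return "b"      # both communities mixed, different majorities (solution 1)


# Hypothetical linear willingness-to-pay curves (illustrative numbers only).
theta_m = 40 / (60 + 40)                    # metropolitan composition N2/(N1+N2)
aa = lambda t: 2.0 * (t - theta_m) - 0.10   # passes below (theta_m, r = 0)
bb = lambda t: 1.0 * (t - theta_m) + 0.15   # passes above (theta_m, r = 0)

lower, upper = theta_bounds(N1=60, N2=40, n_u=50, n_s=50)
theta_star = intersect(aa, bb)
panel = classify(theta_star, lower, upper)
print(lower, upper, round(theta_star, 6), panel)
```

Shifting the hypothetical `aa` line to the left (a stronger peer group effect for the less able) lowers the computed intersection, which is the direction of the progression from panel c toward panel a described in the text.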

The assumption of equal incomes is now relaxed. Following Oates's
(1977) conjecture, it is assumed that socioeconomic desirability is
proxied by income: each less able family has income $y_1$ and each more
able family has income $y_2$, $y_1 < y_2$. The shift in AA', as $y_2$ is increased
by $\Delta y_2$ from the point of equal incomes $y_1 = y_2 = y$, is obtained by
differentiating the amended equation (5'), holding r fixed and using
equations (1) and (4):

$$\left.\frac{\Delta\theta_s}{\Delta y_2}\right|_r = \frac{[F'(C_s) - G'(I_s)]\,\partial I_s/\partial y_2}{(n_s/n_u)H'(\theta_u) + H'(\theta_s)} \quad (+),$$

or the AA' curve shifts to the right. Similarly, BB' shifts to the left.
Descriptively, raising the income of the more able families increases
their willingness to pay for inputs and is qualitatively similar to raising
their input sensitivity. Little generality is therefore lost with the equal
income assumption.

IV. The Normative Model 

The efficient outcomes achievable by a planner are characterized in
this section. In the next section these will be used to show that the
laissez-faire outcome with two heterogeneous and dissimilar
communities (solution 1) is always inefficient. The planner chooses
consumptions $C_u$, $C_s$ and inputs $I_u$, $I_s$ and allocates families to communities by
choosing $\theta_u$, $\theta_s$. Because the allocation of the planner will be the
yardstick with which the laissez-faire outcome will be measured, the
planner is constrained to those choices he could implement when faced
with the same institutional constraints that operate in laissez-faire;
namely, all families within a community must be treated equally, and
no family must wish to “undo” the allocation by migrating. The
planner is also required to maintain the same community sizes. Therefore,
the relevant concept is second-best efficiency.

The planner is constrained to treat equally all families within a 
community. In the same way that the rent premium in laissez-faire 
transfers resources between the communities, the planner is able to 
effect intercommunity transfers. The planner chooses the parameter 
r in the consumption equations and interprets it as a tax on families in 
the suburban community and a transfer to families in the urban com¬ 
munity. Because the allocation must be implementable, the planner 
must also ensure that no family wishes to migrate: hence he is con¬ 
strained by equations (5) and (6), which are now interpreted as the 
self-selection constraints. 

The planner's problem is therefore to choose $I_u$, $I_s$, $\theta_s$, and r to
maximize the utility of the less able family subject to a reservation
utility of the more able family,

$$U(y - I_s - r,\ e(I_s, \theta_s, a_2)) \ge \bar{U}, \qquad (7)$$

and the self-selection constraints (5) and (6). Constraints (5) and (6)
constitute two equations that can be used to “solve out” for the two
variables ($\theta_s$, r) as functions of $I_u$ and $I_s$. This can be interpreted as a
central government setting input levels $I_u$ and $I_s$ and allowing
composition $\theta_s$ and rent premium r to adjust.

The efficient allocation may be a corner solution. I am interested in 
the normative characterization of the positive outcome when both 
communities contain both types, so I introduce the additional con¬ 
straint that families must obtain equal utilities in both communities. 
Therefore, the planner’s problem is 


$$\max\ U\left(y - I_u + \frac{n_s}{n_u}r,\ e\left(I_u,\ \frac{N_2 - n_s\theta_s}{n_u},\ a_1\right)\right)$$

subject to (5a), (6a), and (7).

Proposition 6. The solution to the constrained planner's problem 
has identical communities. 

Proof. The first-order condition is (if we notate $U^{ke}(C, I, \theta)$ as the
utility of a family of ability level $a_k$ in community $e$ and notate the
relevant derivative with a subscript)

[Equation (8): the display is illegible in the source.]

This is satisfied with identical communities, $\theta_u = \theta_s = N_2/(N_1 + N_2)$,
$r = 0$, and $I_u = I_s$. The second-order condition confirms that this is a
maximum, namely, that the determinant of the bordered Hessian is
positive. Q.E.D.

Proposition 6 establishes that any allocation with $0 < \theta_u < \theta_s < 1$ is
second-best inefficient. There is always another allocation with $\theta_u = \theta_s$
that is Pareto-preferred (and satisfies the second-best constraints).


V. Laissez-Faire and Efficiency

Theorem. In the model of Section II, laissez-faire equilibrium with
$0 < \theta_u < .5 \le \theta_s < 1$ (solution 1) may exist and is second-best inefficient.

Proof. From proposition 5, there exist laissez-faire equilibria with
$0 < \theta_u < .5 \le \theta_s < 1$. From proposition 6, second-best efficient
allocations with $0 < \theta_u \le \theta_s < 1$ have $\theta_u = \theta_s$. Q.E.D.


What Is Causing the Inefficiency? 

The ability of the family affects the educational achievement of the 
other families in the community. All families in a community benefit 
when a more able family migrates in and suffer when it migrates out: 
this is the externality that is not being internalized by the migrant. 
When a more able family migrates from the suburbs to the urban area 
and a less able family reverse migrates, the willingness to pay in the 
urban area for the peer group improvement exceeds the compensa¬ 
tion required in the suburbs for the peer group deterioration. This 
net social gain arises because the effect is strongest in the urban area, 
partly because of the diminishing returns to peer group (assuming $H''$
and $S'' < 0$, as in Henderson et al. [1978]) and partly because the
urban area contains the greatest concentration of families that are
highly sensitive to the peer group (assuming $S' < H'$, as in Summers
and Wolfe [1977]). Because private incentives do not take this into
account, migration stops when $\theta_s$ is “too high.”

The gain arises because of the effect of the migrant on the environment,
and not because of any differential effect of the peer group on
the migrating less and more able families. It might be thought that
mixing should be encouraged because the gain in the educational 
achievement of the migrating less able child that enters the suburbs 
exceeds the loss in educational achievement of the migrating more able 
child that enters the urban area. This may be true from a welfare 
perspective (e.g., if social welfare is the sum of individual educational 
achievements, as in Arnott and Rowse [1987]), but my explanation 
makes clear that it is not the cause of inefficiency. A migrating family 
obtains equal utility in either community: the Pareto improvement is 
possible because of the effect of the migrating family on the environ¬ 
ment. 

Although the planner is constrained to treat equally all families 
within a community, he is nevertheless able to make Pareto improve¬ 
ments by moving more able families into the urban area, compensat¬ 
ing the suburban area with lower rents and simultaneously adjusting 
inputs. A calculation of willingness to pay makes the argument pre¬ 
cise and interprets equation (8). Each urban family is willing to pay 
gross $A_u$ for the in-migration of $n_u\,d\theta_u$ more able families (and for the
reverse migration of an equal number of less able families). Because
of the second-best nature of the problem, the planner could only
make all families pay the same amount, so that inputs would have to
be simultaneously adjusted to ensure that all urban families have
equal willingness to pay for the overall change. The definition of $A_u$
therefore implies

$$dU^{1u} = 0 = U_C^{1u}\left(-dI_u + \frac{n_s}{n_u}dr - A_u\right) + U_I^{1u}\,dI_u + U_\theta^{1u}\,d\theta_u,$$

$$dU^{2u} = 0 = U_C^{2u}\left(-dI_u + \frac{n_s}{n_u}dr - A_u\right) + U_I^{2u}\,dI_u + U_\theta^{2u}\,d\theta_u.$$

If we eliminate $dI_u$, the net urban willingness to pay is



The left-hand side of equation (8) should therefore be interpreted as 
the urban family’s net willingness to pay for a unit improvement in 
peer group. 

An analogous argument establishes that the right-hand side of 
equation (8) is the compensation that must be paid by each urban
family to suburban families for the unfavorable migration (after
allowing for the suburban input adjustment necessitated by the
second-best nature of the problem). Migration can be Pareto-improving until
the willingness to pay equals the necessary compensation. This ex¬ 
plains equation (8). 

The planner can effect a Pareto improvement by using inputs as his 
only instrument. Higher inputs lead to an improvement in commu¬ 
nity composition: by ignoring this, the voter is underestimating the 
private marginal benefit of inputs. However, communities are linked 
by migration, so that an improvement in own composition implies a 
deterioration in the composition of the adjacent community. This is 
an ignored social cost. Because $\theta_u \le \theta_s$ and either because the peer
group effect is strictly concave (as in Henderson et al.) or because the 
urban area contains a greater concentration of families that are highly 
sensitive to the peer group (as in Summers and Wolfe), the ignored 
urban benefit of increased inputs (measured by the willingness to pay) 
exceeds the additional social cost (measured by the compensation 
required by the suburbs), so that the urban voter is voting “too few” 
inputs. The converse applies in the suburbs. 


The Role of the Rent Premium 

In Tiebout’s (1956) model, families searching over communities are 
isomorphic to consumers shopping over bundles. The “price” of the 
local public service is its tax price, and the competitive forces that 
drive a private economy to efficiency also drive the Tiebout model to 
efficiency (Bewley 1981). In my model the suburban rent premium 
does not play the part of the price of the better peer group (to pro¬ 
mote efficiency): families are unable to make marginal trades—to buy 
more or less peer group at a stated price per unit. Instead, it acts as a 
transfer payment to ensure that families are indifferent about resi¬ 
dency. 


Relation to the Literature and Policy Implications 

In the models of Flatters et al. (1974), Stiglitz (1977), and Brueckner 
(1979), there is a “fiscal externality” of migration, and under laissez- 
faire, regions achieve inefficient population sizes. In order to in¬ 
ternalize the externality, the overpopulated region should provide 
grants to the underpopulated region, and public spending should be 
neither taxed nor subsidized. In my model, the inefficiency arises 
because communities achieve inefficient compositions. The externality 
arises because a migrating family affects the environment of the
community. The externality cannot be internalized by intercommunity
grants because, as community sizes are fixed, any grant is immediately
capitalized so that net rents remain unchanged. I have shown that the
planner is able to achieve efficiency using inputs as his only
instrument. By ignoring the associated migration, the urban area voted “too
low” inputs and the suburban area voted “too high” inputs. The
correct policy response is therefore to impose a nonlinear subsidy/tax on
inputs, so that communities change their voted inputs, leading to
integration. An alternative policy response is to merge the
jurisdictions into a metropolitan area$^{19}$ (or to choose community sizes so that
$\bar{\theta} < \underline{\theta}$ or $\theta^* < \underline{\theta}$).


VI. Concluding Remarks 

The empirical peer group effect may make communities heteroge¬ 
neous and dissimilar, with different majorities in different areas vot¬ 
ing different expenditure levels. This equilibrium may occur when 
the peer group effect is neither too weak nor too strong relative to the 
difference in tastes for inputs. The equilibrium is also (second-best) 
inefficient: the inefficiency arises from the externality generated by 
the migrating family on the environment of the community. The 
results are important for the current discussion on the devolution of 
authority to local government, both in the United States and in the 
United Kingdom. 

The model includes two important simplifications. First, all houses 
are identical, and, second, there are only two family types. These 
assumptions will be relaxed in future research. 


Appendix A 

Proof of the Propositions 

To prove proposition 1, I first establish the following lemma. 

Lemma (the resource allocation lemma). In the model of equations (1)-(6),
equilibrium with $\theta_u < \theta_s$ implies $I_u \le I_s$ and $C_s < C_u$.

Proof of the lemma. If we consider the case in which both communities
contain both types (the corner outcomes can be similarly proved), the
no-migration equations (5a) and (6a) become

$$F(C_s) + G(I_s) + H(\theta_s) = F(C_u) + G(I_u) + H(\theta_u),$$

$$F(C_u) + R(I_u) + S(\theta_u) = F(C_s) + R(I_s) + S(\theta_s),$$

or

$$G(I_s) - G(I_u) + H(\theta_s) - H(\theta_u) = F(C_u) - F(C_s) = R(I_s) - R(I_u) + S(\theta_s) - S(\theta_u).$$

But equation (2) implies $S(\theta_s) - S(\theta_u) \le H(\theta_s) - H(\theta_u)$ and, hence,
$G(I_s) - G(I_u) \le R(I_s) - R(I_u)$. Hence, with inequality (1), $I_u \le I_s$. With
$I_u \le I_s$, $F(C_u) - F(C_s) = R(I_s) - R(I_u) + S(\theta_s) - S(\theta_u) > 0$ and, hence,
$C_s < C_u$. Q.E.D.

19 If the difference between families is interpreted to be racial, this policy would
correspond to court decisions that put a whole metropolitan area under a
desegregation order with busing across local community lines.


Proof of Proposition 1 

Suppose that the proposition is false: the decisive voter in each community
has the same ability and $\theta_u < \theta_s$. With the relevant equation of equations (3)
and (4) and normality (implied by separable utility), the decisive family type
votes for higher inputs only if it also gets higher consumption: $0 \le \theta_u < \theta_s < .5$
or $.5 \le \theta_u < \theta_s \le 1$, and $C_s < C_u \Rightarrow I_s < I_u$. This contradicts the resource
allocation lemma above.


Proof of Proposition 2 

To show that the AA' curve is upward sloping, totally differentiate equation
(5') and use equation (3) to give

$$\left.\frac{dr}{d\theta_s}\right|_{AA'} = \frac{(n_s/n_u)H'(\theta_u) + H'(\theta_s)}{[(n_s/n_u)F'(C_u) + F'(C_s)] + [F'(C_s) - G'(I_s)]\,dI_s/dr}. \qquad (A1)$$

Equations (1) and (4), with normality ($-1 < dI_s/dr < 0$), imply
$F'(C_s) > -[F'(C_s) - G'(I_s)]\,dI_s/dr > 0$, or the denominator is positive and the AA'
curve is upward sloping as claimed.

If both communities were to have the metropolitan composition $\theta_u = \theta_s =
N_2/(N_1 + N_2)$, $r = r_A$ on AA' such that

$$F(y - I_s(r_A) - r_A) + G(I_s(r_A)) = F\left(y - I_u(r_A) + \frac{n_s}{n_u}r_A\right) + G(I_u(r_A)). \qquad (5''')$$

To find the value $r_A$ that solves this equation, consider $r_A = 0$. The term $I_u(r_A)$
is the input level chosen by the less able families so that, at $r_A = 0$, by revealed
preference, $F(y - I_s(0)) + G(I_s(0)) < F(y - I_u(0)) + G(I_u(0))$. Normality
implies $-1 < dI_s/dr < 0$. As $r_A$ falls, therefore, the left-hand side of equation
(5''') increases monotonically. Similarly, the right-hand side decreases
monotonically. Therefore, the only solution to equation (5''') has $r_A < 0$, as claimed
in the proposition.


Proof of Proposition 3 


To show that BB' is upward sloping, totally differentiate equation (6'), and
use equations (1), (3), (4), and normality ($0 < dI_u/dr$) to give

$$\left.\frac{dr}{d\theta_s}\right|_{BB'} = \frac{(n_s/n_u)S'(\theta_u) + S'(\theta_s)}{[(n_s/n_u)F'(C_u) + F'(C_s)] + [R'(I_u) - F'(C_u)]\,dI_u/dr} \quad (+). \qquad (A2)$$

If both communities were to have the metropolitan composition $\theta_u = \theta_s =
N_2/(N_1 + N_2)$, $r = r_B$ on BB' such that

$$F\left(y - I_u(r_B) + \frac{n_s}{n_u}r_B\right) + R(I_u(r_B)) = F(y - I_s(r_B) - r_B) + R(I_s(r_B)). \qquad (6''')$$

To find the value $r_B$ that solves this equation, consider $r_B = 0$. The term $I_s(r_B)$
is the input level chosen by the more able families so that, at $r_B = 0$, by
revealed preference, $F(y - I_u(0)) + R(I_u(0)) < F(y - I_s(0)) + R(I_s(0))$.
Normality implies $0 < dI_u/dr < n_s/n_u$. As $r_B$ rises, therefore, the left-hand side of
equation (6''') increases monotonically. Similarly, the right-hand side
decreases monotonically. Therefore, the only solution to equation (6''') has
$0 < r_B$, as claimed in the proposition.


Proof of Proposition 4 

With equations (A1), (A2), (2), and normality, at any point of intersection

$$\left.\frac{dr}{d\theta_s}\right|_{BB'} < \left.\frac{dr}{d\theta_s}\right|_{AA'},$$

or AA' must cross BB' from below. This implies that there is at most one
intersection.

Propositions 5 and 6 are proved in the text. 


Appendix B 

Comparative Statics of a Change in Input Sensitivity 

The utility functions are rewritten as

$$U(C, e(I, \theta, a_1)) = F(C) + aG(I) + H(\theta),$$

$$U(C, e(I, \theta, a_2)) = F(C) + bR(I) + S(\theta).$$

Increasing the input sensitivity of the more able (increasing the input
sensitivity difference) is interpreted as increasing b. With r held fixed, totally
differentiating the amended equation (5') gives

$$-F'(C_s)\frac{dI_s}{db} + aG'(I_s)\frac{dI_s}{db} + H'(\theta_s)\frac{d\theta_s}{db} = -\frac{n_s}{n_u}H'(\theta_u)\frac{d\theta_s}{db}.$$

If we rearrange with equations (1) and (4), the shift in AA' becomes

$$\left.\frac{\Delta\theta_s}{\Delta b}\right|_r = \frac{[F'(C_s) - aG'(I_s)]\,dI_s/db}{(n_s/n_u)H'(\theta_u) + H'(\theta_s)},$$

or the AA' curve shifts to the right: at any given r, the suburban community is
voting higher inputs and is less attractive to the less able family. Similarly, if
we hold r constant and totally differentiate equation (6'), BB' shifts to the left
as b increases. Combining these two shifts shows that, as the more able family
becomes more input sensitive, the communities become less alike: $\theta_s$ and r
rise.

The analysis is readily repeated to find the effect of increasing the input 
sensitivity a of the less able family: AA' shifts to the left and BB' shifts to the 
right. 
The Implementation Process of Comparable
Worth: Winners and Losers


Peter F. Orazem and J. Peter Mattila 

Iowa State University 


This paper provides a unique opportunity to observe how a public 
policy affected the earnings of various interest groups at different 
stages of implementation. Specifically, we examine how the earnings 
of women, union members, and supervisory and professional staff 
were affected by various proposed and implemented comparable 
worth pay plans in Iowa. We find that large relative gains to women 
in the original proposed plans were reduced as the process evolved. 
As a result, some of the original gains to women were redistributed 
to union members, supervisors, and professionals. 


Despite 25 years of legislation related to equal pay, equal opportunity, 
and affirmative action for women in the job market, women still tend 
to be concentrated in relatively low-paying, predominantly female 
jobs. This perceived lack of rapid progress in women’s labor market 
status has motivated some states and localities to consider or imple¬ 
ment comparable worth as an additional weapon in the battle against 
sex discrimination. Supporters view comparable worth as a method 
for achieving immediate increases in the pay for female-dominated 
jobs, given the apparent ineffectiveness of previous legislation in rais¬ 
ing the pay of women relative to that of men. 

We would like to thank the Panel on Pay Equity, National Research Council, for
partial support in funding this study and members of the state government of Iowa for
their cooperation in providing data and information relevant to the comparable worth
process. Jeff Greig and Kyle Stephens provided able research assistance. The views
expressed in this paper are exclusively those of the authors.

A typical comparable worth policy calls for a study of jobs and pay
structure within an organization to determine if there has been any
sex bias in compensation and, if so, to correct the situation. Jobs are
usually analyzed in terms of the skills, effort, responsibility, and work¬ 
ing conditions required. Typically, points will be assigned to each job 
attribute so that a weighted total number of points can ultimately be 
associated with each job. Job classifications having equal (or near 
equal) total points are assigned equal wage rates. Comparable worth 
proposals typically ignore market wages in setting relative pay be¬ 
cause market wages are presumed to embed discrimination against 
predominantly female jobs. Advocates argue that use of the job analy¬ 
sis/factor point method will eliminate sex bias in the pay plan. 
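
The factor point method described above amounts to a weighted sum. The following minimal sketch shows the arithmetic only; the factor names, weights, point scores, and job titles are invented for illustration and are not taken from any actual pay plan, and the function name `total_points` is hypothetical.

```python
def total_points(factor_points, weight_pct):
    """Weighted total job points.

    factor_points: points assigned to each job attribute.
    weight_pct:    weight of each attribute in percent; must sum to 100.
    """
    if sum(weight_pct.values()) != 100:
        raise ValueError("factor weights must sum to 100 percent")
    return sum(weight_pct[f] * factor_points[f] for f in weight_pct) / 100


# Hypothetical weights on the four broad attributes named in the text.
weights = {"skill": 40, "effort": 20, "responsibility": 30, "working_conditions": 10}

# Two hypothetical job classifications with different attribute profiles.
job_a = {"skill": 60, "effort": 40, "responsibility": 50, "working_conditions": 20}
job_b = {"skill": 50, "effort": 50, "responsibility": 50, "working_conditions": 40}

points_a = total_points(job_a, weights)
points_b = total_points(job_b, weights)
same_wage = points_a == points_b  # equal totals imply equal wage rates under the method
print(points_a, points_b, same_wage)
```

In this made-up case the two jobs differ on every attribute except responsibility yet receive the same total, so the method would assign them the same wage rate. Changing the weights changes which jobs are treated as comparable, which is why the choice of weights matters so much in what follows.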

A fundamental problem with such plans is that the valuation of 
different jobs is inherently subjective, meaning that two equally in¬ 
formed, unbiased, and qualified analysts could value the same job 
very differently. The problem becomes more difficult as more and 
more agents with differing agendas are brought into the analysis. 
Because comparable worth pay analysis in the public sector has typi¬ 
cally been conducted with input from various combinations of consul¬ 
tants, politicians, women’s groups, union representatives, supervisory 
staff, and rank-and-file employees, it is clear that the results can be 
influenced by the objectives of the individuals or constituent groups 
involved in the process. The process becomes even more subject to 
external pressure when decisions are made regarding implementa¬ 
tion of the plan. Public-sector budget constraints, market opportuni¬ 
ties for public-sector employees, and union resistance to pay cuts may 
significantly alter the pay plan relative to the initial proposal. 

This paper illustrates how the original goals of a comparable worth 
policy can become diluted as political and economic pressures from 
state budgets, politicians, unions, personnel professionals, market 
forces, and supervisory personnel enter the implementation process. 
We make use of a unique data set from the state of Iowa that allows us 
to examine the earnings structure underlying the initial pay plan, a 
consultant’s initial proposed plan, a plan designed by a steering com¬ 
mittee composed of politicians, a compromise plan implemented after 
negotiations between the state and the union, and the final plan that 
resulted after an appeals process was completed. We are able to show 
how the returns to women, union members, supervisors, and profes¬ 
sionals changed under successive pay plans. The results indicate that 
initial gains to women were ultimately reduced and redirected toward 
constituencies that stood to lose or gain little as a result of the initial 
plan: union members, professionals, supervisors, and those with the 
highest market wages. 

In the next section, we discuss the process of legislating and
implementing comparable worth in the state of Iowa. After a discussion of
our data and methodology, we present our empirical analysis. We
conclude with a summary of our results and an evaluation of whether
these results are likely to generalize to other states.

JOURNAL OF POLITICAL ECONOMY


Comparable Worth in Iowa 

To date in the United States, comparable worth legislation has been 
directed only toward government employees at the state or local level. 
A recent law in Ontario, Canada, extended coverage to private-sector 
firms as well. At least seven states completed or have begun to imple¬ 
ment comparable worth pay adjustments (Connecticut, Iowa, Min¬ 
nesota, New York, Oregon, Washington, and Wisconsin), at least two 
states have completed studies and are deciding whether to implement 
pay adjustments (Michigan and New Jersey), and several other states 
are in the process of studying or are considering a study of pay
inequities.$^{1}$ In addition, a large number of municipal, county, and
school district governing units have initiated comparable worth stud¬ 
ies or plans (Ehrenberg and Smith 1987). 

This study focuses on the implementation of comparable worth 
plans in Iowa. This process started in 1983 when the Iowa legislature 
voted to fund an initial study of the Iowa Merit Pay System by consul¬ 
tant Arthur Young and Company. A steering committee composed of 
legislators, administrators, and union representatives voted not to use 
market wage survey data in the analysis. In cooperation with the 
consultant, 13 factors (discussed in more detail below) that measured 
various aspects of skill, effort, responsibility, and working conditions 
were defined. Four-person teams of employees and supervisors as¬ 
signed points to each factor for each job to which they were assigned. 
The teams based their evaluations on employee questionnaires. 

Factor weights were obtained using two different methods. First, 
the Arthur Young consultants derived statistical weights based on the 
estimated coefficients computed by regressing pay grade on each of 
the 13 job factors plus a variable that controlled for the percentage 
female in each job. These estimated weights included several that 
were statistically insignificant and three that had small negative values. 
After examining these regression weights, the steering committee 
defined a second set, which we refer to as the committee weights. 

1 Several states in recent years have developed new state pay plans using factor point- 
count methods but have also incorporated market wage survey information in setting 
wage rates for key jobs. Although these state plans (such as in Idaho, Louisiana, Massa¬ 
chusetts, New Mexico, Ohio, and Tennessee) have many features that are similar to 
comparable worth, they deviate to the extent that market wage rates alter relative pay. 
The state of Washington also utilized market wage survey information, but we include
it in our list of comparable worth states since it has been widely publicized as such and,
in particular, because the unions had to sue to obtain implementation of the state's
original intent to make such adjustments.

These subjective weights differed in that all were assigned positive
values and no factor was given a weight below 5 percent or above 15 
percent of the total. 
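
As a rough illustration of how a set of factor weights turns factor points into total job points, and of the bounds the committee placed on its weights, consider the following Python sketch. The factor names, the point values, and both weight vectors are hypothetical placeholders, not the values used in Iowa; only the count of 13 factors and the 5 to 15 percent bounds come from the text.

```python
from typing import Dict

Weights = Dict[str, float]

def shares(weights: Weights) -> Weights:
    """Express each weight as a share of the total, so shares sum to 1."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

def total_job_points(factor_points: Dict[str, float], weights: Weights) -> float:
    """Total job points: each factor's points times that factor's share."""
    s = shares(weights)
    return sum(s[name] * factor_points[name] for name in s)

def within_committee_bounds(weights: Weights, low: float = 0.05,
                            high: float = 0.15) -> bool:
    """True if every factor's share lies between 5 and 15 percent."""
    return all(low <= share <= high for share in shares(weights).values())

# Hypothetical factor names (the paper's 13 factors are not listed here).
FACTORS = [f"factor_{i}" for i in range(1, 14)]

# Hypothetical regression-style weights: some near zero, three negative.
statistical = dict(zip(FACTORS, [0.20, 0.18, 0.15, 0.12, 0.10, 0.08, 0.07,
                                 0.06, 0.05, 0.02, -0.01, -0.01, -0.01]))

# Hypothetical committee-style weights: all positive, each within bounds.
committee = dict(zip(FACTORS, [0.08] * 11 + [0.06] * 2))

# A hypothetical job that scores high on a factor the regression penalizes.
job = {name: 100.0 for name in FACTORS}
job["factor_11"] = 300.0

points_statistical = total_job_points(job, statistical)
points_committee = total_job_points(job, committee)
```

In this made-up example the job earns more total points under the committee-style weights than under the regression-style weights, which is the kind of shift the text attributes to giving positive weight to factors that had negative or near-zero coefficients.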

Two sets of total job points (and hence pay grade plans) were gen¬ 
erated, one for each set of factor weights. Each recommended plan 
would have decreased the pay of some job classifications while in¬ 
creasing the pay of other classifications, thereby helping to limit the 
cost of implementation. The unions, which had been highly suppor¬ 
tive up to this point, became resistant when pay cuts became a possi¬ 
bility. From the state’s perspective, elimination of the pay cuts would 
have raised the cost of implementation to an unacceptable level. 
These issues became part of the contract negotiations between the 
state and the unions in 1984. The compromise settlement was
that no one would suffer a reduction in pay grade and that all
increases would be reduced by one pay grade and one step. This plan was
implemented (and extended by the governor to noncontract employ¬ 
ees) in March 1985. 

After implementation, an appeals process was put into effect to 
hear complaints concerning the comparable worth adjustments. Non¬ 
union appeals were heard by a panel of five personnel professionals, 
while union appeals were heard by a joint union-management panel. 
Forty-five percent of the merit system pay recommendations were 
appealed, with roughly equal proportions receiving increases, decreases,
and no change on appeal. The final settlement of a new
round of union contract negotiations provided for implementation of the
recommended pay increases in full in July 1987 but canceled all recommended
pay cuts. We estimate that, in total, the final comparable
worth system increased annual state payrolls by $26.2 million in 1983 
dollars or 8.8 percent of the original payroll. 


Methodology and Data 

Our objective is to isolate the effect of alternative comparable worth 
pay plans on the structure of earnings. To do this, we must not allow 
other factors to change that would also alter the pay structure. It 
would be inaccurate, for example, to compare earnings functions 
estimated over samples of workers employed before implementation 
and after implementation. Because the new pay system may cause 
some employees to quit, others to transfer to different jobs, and still 
others to enter state employment, such comparisons of snapshots of 
the state pay structure at different points in time will be subject to 
sample selection bias. Second, changes in other exogenous influences 
over time such as political elections, shifts in public demand for government
services, or changes in government revenue could also alter
the pay structure. Such coincident influences on employee earnings
would render difficult any derivation of the comparative static effects 
of comparable worth. Finally, our objective is to illustrate how the 
proposed structure of earnings evolved over time. But not all pro¬ 
posed plans were implemented. Thus we require a methodology that 
allows us to analyze both earnings structures that were implemented 
and those that were never adopted. 

We resolve these potential problems by holding constant a Decem¬ 
ber 1983 sample of state employees and then observing how their pay 
would have changed (in 1983 dollars) as a result of each of the pro¬ 
posed or implemented comparable worth plans discussed above. That 
is, given an individual’s 1983 job classification and pay grade, we are 
able to compute the number of pay grades that his or her job would 
have increased or decreased given each comparable worth proposal. 
Using the December 1983 pay schedule, we then compute what an 
individual's pay would have been in each case. We compute five dif¬ 
ferent earnings rates for each employee: (1) the actual 1983 earnings, 
(2) the earnings associated with the recommended Arthur Young 
plan using the statistical weights, (3) the earnings associated with the 
recommended steering committee plan using the committee weights, 
(4) the earnings rate associated with the implemented state/American 
Federation of State, County, and Municipal Employees (AFSCME) 
compromise plan, and (5) the earnings rate associated with the imple¬ 
mented state/AFSCME appeals plan. We are therefore able to analyze 
all plans, whether implemented or not, avoiding biases associated with 
sample selection or coincident changes in exogenous variables other 
than comparable worth. 
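
The fixed-sample procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the pay schedule, the grade numbers, and the recommended grade changes are hypothetical, the real schedule also has steps within grades, and the one-step part of the compromise rule is ignored.

```python
from typing import Dict

# Hypothetical December 1983 schedule: pay grade -> biweekly pay (1983 dollars).
PAY_SCHEDULE_1983: Dict[int, float] = {
    20: 500.0, 21: 520.0, 22: 541.0, 23: 563.0, 24: 586.0,
}

def compromise_change(recommended_change: int) -> int:
    """Compromise rule as described in the text: no pay-grade cuts, and every
    increase reduced by one grade (the extra one-step reduction is ignored)."""
    return max(0, recommended_change - 1)

def counterfactual_pay(grade_1983: int, grade_change: int,
                       schedule: Dict[int, float]) -> float:
    """Pay the 1983 incumbent would receive if the job moved by grade_change
    grades, valued on the 1983 schedule so that only the plan differs."""
    return schedule[grade_1983 + grade_change]

def earnings_by_plan(grade_1983: int, changes: Dict[str, int],
                     schedule: Dict[int, float]) -> Dict[str, float]:
    """Earnings for one fixed employee under the 1983 plan and each alternative."""
    out = {"1983": schedule[grade_1983]}
    for plan, change in changes.items():
        out[plan] = counterfactual_pay(grade_1983, change, schedule)
    return out

# Hypothetical grade changes for one job classification.
changes = {"statistical": -1, "committee": 2,
           "compromise": compromise_change(2)}
pay = earnings_by_plan(21, changes, PAY_SCHEDULE_1983)
```

Because the employee, the 1983 grade, and the schedule are held fixed, any difference among the entries of `pay` reflects the plan alone, which is the point of the design.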

We use the standard earnings function approach pioneered by 
Mincer (1974) to relate earnings to a set of human capital and individ¬ 
ual characteristics according to 

$$\ln W_{ik} = a_k + b_k S_i + c_k U_i + d_k P_i + e_k \ln MW_i + f_k X_i + \varepsilon_{ik}, \tag{1}$$

where $\ln W_{ik}$ is the natural logarithm of individual $i$'s biweekly earnings
under pay plan $k$, $S_i$ is a dummy variable that takes the value of
one if the incumbent is female, $U_i$ is a vector of union status variables,
$P_i$ represents dummy variables for professional and supervisory positions,
$\ln MW_i$ is the natural logarithm of the median market wage for
the occupation, and $X_i$ is a vector of other human capital and personal
characteristics commonly used in earnings functions. These variables
are defined more precisely below. The parameters $a_k$, $b_k$, $c_k$, $d_k$, $e_k$, and
$f_k$ are specific to pay plan $k$, and $\varepsilon_{ik}$ is the error term. These parameters
may be compared across pay plans to determine how returns to the
various characteristics change across pay plans.

More directly, we could estimate the change in returns across pay
plans by estimating the equation

$$\ln W_{ik} - \ln W_{i0} = (a_k - a_0) + (b_k - b_0) S_i + (c_k - c_0) U_i + (d_k - d_0) P_i + (e_k - e_0) \ln MW_i + (f_k - f_0) X_i + (\varepsilon_{ik} - \varepsilon_{i0}), \tag{2}$$

where the zero subscript represents the original 1983 pay structure. 
The estimated coefficients in (2) measure the change in the return to 
the various characteristics under pay plan k relative to the original 
1983 pay plan. A positive coefficient implies that the characteristic 
“wins” relative to other characteristics as a result of comparable 
worth, while a negative coefficient means that it “loses.” Estimation of 
(1) and (2) will allow us to observe how sex, union status, professional 
and supervisory status, market wages, and other characteristics 
change in importance and value in each successive pay structure. 
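
Because the same employees and the same regressors appear under every plan, least squares applied to the pay difference returns exactly the difference between the plan-specific coefficients. The sketch below checks this on synthetic data; the sample size, the variables, and the coefficient values are invented for illustration and are not the paper's estimates.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic regressors for a fixed sample: constant, female dummy,
# union dummy, and log market wage.
female = rng.integers(0, 2, size=n).astype(float)
union = rng.integers(0, 2, size=n).astype(float)
log_mw = rng.normal(6.0, 0.3, size=n)
X = np.column_stack([np.ones(n), female, union, log_mw])

# Synthetic log earnings under the original plan (0) and an alternative plan (k).
ln_w0 = X @ np.array([1.0, -0.04, 0.02, 0.80]) + rng.normal(0.0, 0.1, size=n)
ln_wk = X @ np.array([1.2, 0.05, -0.06, 0.75]) + rng.normal(0.0, 0.1, size=n)

def ols(y: np.ndarray, X: np.ndarray) -> np.ndarray:
    """Ordinary least squares coefficients."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

beta_0 = ols(ln_w0, X)             # plan-specific regression for the original plan
beta_k = ols(ln_wk, X)             # plan-specific regression for plan k
beta_diff = ols(ln_wk - ln_w0, X)  # regression on the pay difference

# The difference regression reproduces the coefficient differences.
same = np.allclose(beta_diff, beta_k - beta_0)
```

The equality holds because the least squares estimator is linear in the dependent variable when the regressor matrix is held fixed, so the two ways of reading the results carry the same information about point estimates.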

Our data set consists of a random sample of 3,734 state government 
employees in Iowa as of December 1983. This was roughly one-fifth 
of total state merit employment. 2 Personal characteristics as well as job 
classifications, biweekly pay, supervisory/professional status, union 
contract coverage, union dues checkoff, and employment time with 
the state came from a December 1983 payroll tape. Educational at¬ 
tainment, licensing, vocational training, military experience, and non¬ 
state work experience were culled from state personnel record files. 
We used (generally private-sector) wage survey data published by the 
Job Service of Iowa (1984) to measure median occupational market 
wages. We compared job descriptions used in the Job Service survey
with job descriptions used by the State Merit Employment Depart¬ 
ment in order to match jobs as closely as possible. 

Each employee’s actual biweekly earnings rate during December 
1983 was extracted from the payroll tape. 3 Given the individual’s job 
classification, we used tables supplied by Arthur Young and Company 
(1984) (for the statistical and committee plans) and by the state (for 
the 1985 and 1987 plans) to infer how the individual's pay grade 
would have changed under each of the comparable worth plans. 
These pay grades were translated into biweekly earnings rates using 

2 Employees of the state universities and a small number of nonmerit employees (not 
subject to Civil Service exams and procedures) were excluded from the Iowa compara¬ 
ble worth process. 

3 There is no need to adjust for the value of fringe benefits or for cost-of-living
differentials since all state employees receive the same benefits and since the vast
majority of employees live and work in central Iowa, in the Des Moines-Ames metropolitan
area.

the December 1983 pay plan tables that were also supplied by the
state. See Orazem and Mattila (1988) for further details on the data 
set and the measurement of the variables. 


Results 

The Female-Male Pay Gap 

Means and standard deviations of the variables are reported in table 
1. Most of the explanatory variables are binary variables, although 
education, tenure, experience, and years out of the labor force are 
measured in years. The market wage is shown in dollars in table 1, 
although entered in log form in the regressions so that its coefficient 
can be interpreted as an elasticity. 

Summary statistics are also shown separately for men and for 
women, who represented 49.9 percent of the sample. 4 In 1983, wom¬ 
en’s earnings averaged 78.7 percent as much as those of men. Each of 
the comparable worth plans would have increased the typical wom¬ 
en’s pay more than men’s pay. Indeed, the plan based on the statistical 
weights would have reduced the average men’s biweekly pay by
$22.20 and raised women’s pay to 88 percent of the average men’s
pay. However, as a result of implementation and the appeals process, 
no one's pay was cut, and women ended up gaining $68.99 biweekly 
(12.3 percent) while men gained $42.85 biweekly (6.0 percent). As a 
consequence, average female earnings rose to only 83.4 percent of 
average male earnings. Overall, the net effect of comparable worth 
was to increase women’s relative pay by less than five percentage 
points. 
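
The aggregate figures can be restated as a short worked calculation. It uses only the reported female-to-male earnings ratios (0.787 in 1983, 0.880 under the statistical plan, 0.834 after appeals); the "share of the raw gap closed" is an illustrative summary computed here, not a statistic reported by the authors.

```latex
\begin{aligned}
\text{raw gap, 1983} &= 1 - 0.787 = 0.213 \\
\text{raw gap after appeals} &= 1 - 0.834 = 0.166 \\
\text{change in ratio} &= 0.834 - 0.787 = 0.047 \quad (4.7 \text{ percentage points}) \\
\text{share of raw gap closed (implemented)} &= 0.047 / 0.213 \approx 0.22 \\
\text{share of raw gap closed (statistical plan)} &= (0.880 - 0.787) / 0.213 \approx 0.44
\end{aligned}
```

On this reading, the implemented system closed about half as much of the raw gap as the statistical plan would have.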

At this level of aggregation, it appears that women did gain as a 
result of comparable worth. However, economists tend to be more 
interested in the size of the female-male earnings gap after control¬ 
ling for human capital and personal characteristics. 5 Table 2 provides 


4 In 1983, 47.3 percent of state government employees were female. By comparison, 
42 percent of Iowa private-sector and local-government employed were female, ac¬ 
cording to 1980 census data. 

5 We exclude the 13 original job factors and the total factor points from the regression
so as to focus on the pure effect of the personal and human capital variables.
Variables such as supervisors, education, and experience have corresponding factors
that compete to explain the same effects. Also, as we argue in this paper, various
groups such as unions, professionals, and supervisors were able to influence the job
factors and the factor-weighting process so that factor points are also a function of sex,
union status, professional status, and supervisory status. Thus, e.g., the full impact of
union status on the pay structure is the direct impact (through negotiation) and the
indirect effect through potential influence on factor weights and measurements. For
tests of the sensitivity of our results to other specifications, see Orazem and Mattila
(1989).

TABLE 3

Ratio of Female to Male Earnings

                                                      Arthur Young   Arthur Young   1985         1987
                                              1983    Statistical    Committee      Compromise   Appeal
Uncorrected                                   .787    .880           .870           .829         .834
Corrected (including market wage)*            .951    1.009          .996           .964         .971
Corrected (excluding market wage)*            .877    .986           .957           .913         .916
Corrected (as directly implied by table 2)    .961    1.049          .989           .968         .975

*The ratio is computed as the average female wage divided by the predicted female wage under the male
earnings structure. The latter was computed by first estimating eq. (1) over the sample of male incumbents for each
plan, including and excluding the market wage, respectively. Their coefficients represent the male earnings structure.
The summation of the average female characteristics multiplied by their respective coefficients from the male
earnings structure is a commonly used estimate of what women would earn if their characteristics were rewarded at
the same rate as men.


regression estimates of equation (1) in which we control for standard 
measures of education, training, work experience, marital status, and 
race, as well as the market wage. After controlling for these variables, 
we see in column 1 that women were underpaid by only 3.9 percent 
relative to men in 1983. This coefficient is sensitive to inclusion of the 
market wage variable. When equation (1) is reestimated excluding the 
market wage, women earned 12 percent less than men. 

For our purposes, what is more important is how returns to women 
change from one pay plan to the next, not the magnitude of the 
differential. In table 3, we report the estimated ratio of female to male 
wages both controlling and not controlling for market wages. The 
highest relative female earnings occur in the statistical plan. However, 
later revisions and compromises tended to dissipate these gains for 
women. This general pattern of reductions in the relative gains to 
women is not altered by the inclusion or exclusion of market wages. 
Overall, the results suggest that the implemented plan reduced the 
unexplained pay gap between men and women by 32 to 40 percent,
whereas the original proposed statistical plan would have virtually 
eliminated the pay gap. 
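
The corrected ratios in table 3 are described as the average female wage divided by the wage predicted for women under the male earnings structure. A minimal Python sketch of that calculation follows. It is not the authors' code: the data are synthetic, the regressors (education and tenure) and coefficient values are hypothetical, and the ratio is formed from mean log earnings, which is one reasonable reading of the table note. The helper `gap_reduction` is likewise an illustrative name.

```python
import numpy as np

def corrected_ratio(ln_w_f: np.ndarray, X_f: np.ndarray,
                    ln_w_m: np.ndarray, X_m: np.ndarray) -> float:
    """Female earnings relative to what average female characteristics
    would earn under the male earnings structure (computed in logs)."""
    beta_m, *_ = np.linalg.lstsq(X_m, ln_w_m, rcond=None)  # male structure
    predicted_ln_w_f = X_f.mean(axis=0) @ beta_m            # female means, male returns
    return float(np.exp(ln_w_f.mean() - predicted_ln_w_f))

def gap_reduction(ratio_before: float, ratio_after: float) -> float:
    """Share of the unexplained gap (1 - ratio) removed between two plans."""
    return ((1 - ratio_before) - (1 - ratio_after)) / (1 - ratio_before)

rng = np.random.default_rng(1)
n = 2000

def design(mean_educ: float) -> np.ndarray:
    """Synthetic regressors: constant, years of education, years of tenure."""
    educ = rng.normal(mean_educ, 2.0, size=n)
    tenure = rng.uniform(0.0, 20.0, size=n)
    return np.column_stack([np.ones(n), educ, tenure])

beta_true = np.array([5.0, 0.06, 0.01])  # hypothetical returns
X_m, X_f = design(13.5), design(13.0)
ln_w_m = X_m @ beta_true + rng.normal(0.0, 0.1, size=n)
# Women receive the same returns minus a 0.04 log-point unexplained penalty.
ln_w_f = X_f @ beta_true - 0.04 + rng.normal(0.0, 0.1, size=n)

ratio = corrected_ratio(ln_w_f, X_f, ln_w_m, X_m)  # close to exp(-0.04)
```

Applied to the published ratios in table 3, `gap_reduction(0.951, 0.971)` is about 0.41 and `gap_reduction(0.877, 0.916)` is about 0.32, close to the 32 to 40 percent range quoted above.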


The Arthur Young Statistical Weight Plan 

The original proposed pay plan devised by the Arthur Young consul¬ 
tants weighted the job factors on the basis of coefficients derived from 
a regression of pay grades on measured job characteristics. Of the 
four comparable worth plans that we consider, this plan was least
subject to political forces. Although based on the same 13 job factors
as the other plans, its weights were determined in a “scientific” manner.
The statistical plan proposed cutting pay for some jobs while
raising pay for others.

The impact of the statistical plan may be analyzed either by com¬ 
paring columns 1 and 2 of table 2 or directly by focusing on the pay 
difference coefficients of column 6. In either case, women would have 
enjoyed a substantial 8.8-percentage-point gain in pay on average. 
Because women started 3.9 percent behind men under the 1983 pay
plan, the statistical plan would have given them a 4.9 percent advantage
over men, other things constant.

It is notable that the earnings of unionized workers would have 
deteriorated. Those covered by a union contract would have lost 3.8 
percentage points in pay. Those who were dues-paying members 
(with professional associations excluded) would have lost an addi¬ 
tional 4.8 percent in pay, for a total loss of 8.6 percent. One might 
have expected these effects if unions were disproportionately male. 
However, referring back to table 1, we see that men and women were 
almost equally likely to be covered by a union contract. On the other 
hand, men were somewhat more likely to be dues-paying members of 
unions than women were, while equal proportions of each sex paid 
dues to a professional association. 

Men are more likely to be professionals and supervisors than 
women are. As seen in table 2, professional employees would have 
lost 1.9 percent of earnings, other things constant, under the statisti¬ 
cal plan. Supervisors would have neither gained nor lost any earnings 
as indicated by the statistically insignificant coefficient for pay differ¬ 
entials. 


The Committee Weights Plan 

On receiving the consultants’ proposed plan, the steering committee 
devised its own set of weights. Job factors such as physical demands, 
working environment, mental/visual demands, unavoidable hazards/ 
risks, and work pace/pressures and interruptions that had negative or 
near-zero coefficients in the statistical plan were given positive 
weights. To the extent that these characteristics were closely associ¬ 
ated with blue-collar and clerical jobs, this could be expected to raise 
the pay of many workers covered by union contract. In fact, the 
committee explicitly took into account how the factor weights influ¬ 
enced outcomes, including how they affected relative pay for female 
jobs, in revising the factor weights. 6 By setting the weights partly on 


6 The Arthur Young report (1984, p. 50) states that “upon reviewing the results of
the statistical analysis, the committee determined that the preliminary weights again 
needed to be refined. The Steering Committee established, as their policy, a final set of 
weights for each factor. In making their determination they considered the different

the basis of their impacts on outcomes, the committee allowed the
possibility that favoritism toward a constituency could enter the pay 
plan. 

As seen in table 2, for whatever reasons, the losses that would have 
been inflicted on unionized workers under the statistical plan were 
eliminated under the committee plan. The pay difference coefficients 
(col. 7) show that unionized workers would have made small positive, 
although insignificant, gains under the committee plan relative to the 
original 1983 pay plan. At a minimum, the committee plan was neu¬ 
tral toward union jobs. 7 

In the original Arthur Young study, the job factor most closely 
associated with supervision (supervision exercised) had a regression 
coefficient that was essentially equal to the regression coefficients for 
two other factors (impact of errors and guidelines/supervision avail¬ 
able to the workers). The committee plan assigned an 8 percent 
weight to supervision exercised while leaving the other two factors 
with 5 percent weights. The committee plan raised supervisors’ pay by 
3.7 percent relative to the 1983 pay plan. This plan also restored 
professionals’ pay to its original relative level, eliminating a 1.9 per¬ 
cent cut in the statistical plan. 

Although it may not have been the committee’s objective, the net 
result of these changes was to greatly reduce the relative gains for 
women. As opposed to the 8.8 percent gain under the statistical plan, 
women would have gained only 2.8 percent relative to 1983 pay 
schedules under the committee plan. One way of interpreting this is 
that the committee plan shifted the gains toward unions, supervisors, 
and professionals and away from women. Of course, we should keep 
in mind that the regression results highlight the relative gains with 
other variables held constant. In fact, the committee plan raised average
pay for both men and women (as seen in table 1). In other words,
the committee plan achieved less equalization of pay between men 
and women than the statistical plan, but at a much higher total cost to 
the state. 

The 1985 Compromise Plan 

Neither of these plans was ever implemented. Instead the state and 
AFSCME negotiated a compromise pay plan. The major compromise 


impacts on male and female jobs, . . . the statistically derived weights for predicting 
current grade levels, and the ways the factors actually acted in determining the final 
point totals for all jobs." 

7 Even though the committee plan increased average pay, it reduced the standard
deviation of pay across all jobs. This pattern is also typical of the impact of unions on
income distributions. See Freeman and Medoff (1984) for a discussion of the effect of
unions on income inequality.

was a reduction in the size of the comparable worth pay increases in
return for no pay cuts. Both features would be expected to reduce the 
size of the gains to women since female-dominated jobs increased less 
and male-dominated jobs avoided cuts. Our results confirm this: the 
1985 compromise plan increased female earnings by only 0.7 percent¬ 
age points relative to the 1983 plan. Women did gain relative to men, 
but only by a small fraction of the potential gains that would have 
been made had either of the earlier proposals been implemented. 
Once again, this should be interpreted in a relative sense. As seen in 
table 1, in absolute terms, average female pay increased on implemen¬ 
tation more than it would have under the statistical plan but increased 
less than under the committee plan. 

In relative terms, if women gained less, then other interest groups 
must have gained more. Our results in table 2 suggest that unionized 
workers, professionals, and supervisors gained. Those covered by 
union contracts and those paying union dues gained relative to all 
prior plans. Dues-paying workers covered by union contracts enjoyed 
2.5 percent increases in earnings overall. Professionals gained relative 
to all preceding pay plans. It appears that these heavily male job 
classifications may have benefited from the elimination of pay cuts, as 
did unionized workers. Supervisors also gained relative to 1983 and 
to the statistical plan, although not relative to the committee plan. 

The 1987 Appeals Plan 

Of the 798 merit pay job classifications that existed in 1985, 363 were 
appealed. About one-third of the appeals resulted in an increase in 
pay grade. Although 28 percent of the appeals resulted in a proposed 
pay reduction, no reductions were implemented. Therefore, the ap¬ 
peals plan is the same as the compromise plan except for the imple¬ 
mented pay increases. In addition to the appeals, a relatively small 
number of high-level job classifications received comparable worth 
increases in 1987 after implementation had been postponed in 1985. 

We were uncertain about what impact the appeals process would 
have. On the one hand, complaints from men and management con¬ 
cerns about meeting market wages could move the pay plan back 
toward the original 1983 pay structure. On the other hand, complaints
from women and continuing inequities could move it toward
additional gains for women. 

Our results indicate that, on net, women did make some small 
additional gains as a result of the appeals process. Recall that the 1985 
compromise plan left women with only a 0.7 percent gain relative to 
men. Had the appeals recommendations been implemented in full, 
including pay cuts, we calculate (not shown) that women’s gains would
have totaled 2.0 percent. However, this was rolled back to a 1.4 percent
gain.

Unions also gained from the appeals process. Dues-paying mem¬ 
bers of professional associations enjoyed the largest gain, a total pay 
differential of 7.3 percent. Having union representatives on their 
appeals committees appears to have been useful. There was little 
change in the relative pay of supervisors and professionals. 


Relative Gains or Losses to Other Characteristics 

Although comparable worth plans purport to ignore market wages in 
establishing pay, market forces still influence the pay structure in each 
pay plan. Moreover, the statistical plan, the plan least influenced by 
constituent pressure or compromise, reduced the influence of market 
wages the least. The committee plan, on the other hand, reduced the 
influence of market wages by 57 percent relative to the original 1983 
pay structure. Thereafter, the implemented compromise plan and 
the appeals process reintroduced the influence of market wages. At 
least in Iowa, it appears that market forces did influence the outcome 
of the pay analysis process once it came time to actually implement the 
plan, and again as the appeals process allowed further pay adjust¬ 
ments over time. 

Another interesting question has been the impact of a comparable 
worth policy on the pay for minorities. Iowa has a disproportionately 
small population of minorities, and state employment reflects that. 
Only 2 percent of our sample is classified as minority (black, Hispanic, 
or American Indian). Nevertheless, the impact of the various plans 
was to reduce relative pay for minorities by 1 percent in all pay systems.
Both the small population of minorities and the weak statistical
significance of the coefficients (they are significant at the 10 percent
level but not the 5 percent level in the implemented plans) suggest
caution in generalizing this result to other states.

A final interesting effect of the Iowa comparable worth process was
the treatment of educational degrees relative to additional
years of education. There appears to be a clear pattern of increasing
the returns to credentials or threshold levels of education. 8 The coefficients
on dummy variables signifying the attainment of master’s,
doctoral, and vocational degrees and occupational licenses are all positive
and generally significant in the pay differences equations. Recipients
of Ph.D.s and holders of occupational licenses made particularly

8 In other words, there seems to be a shift in relative rewards to education toward the
type of returns emphasized in the screening literature (Spence 1973).

significant relative gains. On the other hand, marginal increases in
human capital as measured by years of education, years of job tenure 
with the state, and years of job experience prior to entering state 
employment either decreased in relative returns or had no change. 

This relative deemphasis in marginal returns to human capital in¬ 
vestment accompanied perhaps by an increase in returns to specific 
threshold levels of education is likely to occur in other settings. First, 
because job pay is set in regard to the minimum requirements for 
successful completion of the job, additional education beyond the 
minimum is likely to be deemphasized. Second, because the plans are 
designed to emphasize factors that previously were not being re¬ 
warded in the pay structure, the factors that had been given impor¬ 
tance in the original pay plan (including tenure, experience, and edu¬ 
cation) must fall in relative importance. 9 


Conclusions 

Our major conclusion is that the ultimate impact of comparable worth 
on the wage structure of state employees in Iowa was greatly modified 
by various interest groups through the political process of legislation 
and implementation. Potential gains of 8.8 percentage points in fe¬ 
male pay relative to male pay under the original statistical plan ended 
up as a gain of only 1.4 percentage points once comparable worth was 
fully implemented and appeals were resolved. 

Through a series of modifications to the plan and through collec¬ 
tive bargaining compromises, other interest groups such as unions, 
supervisors, and professionals were able to avoid potential losses in 
pay that would have accrued under the original statistical plan. In¬ 
deed, these groups ended up with relative pay increases. Dues-paying 
workers covered by union contracts converted potential relative losses 
of 8.6 percent to actual relative gains of 3.4 percent in earnings. 
Similar though smaller gains were made by professionals and supervisors.
Regardless of whether we interpret this as a defensive reaction to
protect their incumbents from economic loss (Hirsch and Addison 
1986) or as rent-seeking activity designed to enhance their income 
(Buchanan, Tollison, and Tullock 1980), the net impact was to shift 
gains away from women toward these interest groups. The bottom 
line is that although women gained, they would have gained much 


9 O’Neill, Brien, and Cunningham (1989) report lower returns to education and job
experience in the Washington State comparable worth plan. In contrast with our results,
they do not find increases in returns to threshold levels of education.

more (given the ultimate state outlay) had the original “formula” not
been modified. 10 

One may question how general our findings are since our data 
relate to one state and one comparable worth process. A final judg¬ 
ment must await further research. However, on the basis of our in¬ 
quiries in other comparable worth states, we find that many of the 
same phenomena are at work. 11 In all states, influential legislators or 
governors played a key role in the early stages along with sympathetic 
high-level managers. Most oversight committees contained super¬ 
visors, especially from the personnel department, and all contained 
legislators except in New York. 

Consultants were hired in the early stages in all states to analyze pay 
discrimination and to conduct a job factor analysis leading to propos¬ 
als for change. Although contracts were divided among Arthur 
Young and Company, Hay Associates, Halcrest-Craver, and Willis 
Associates, they all tended to analyze jobs on the basis of skills required,
effort or major demands, responsibility, and working conditions
and used a point-count method. Then, after examining the
outcomes, the state typically altered the consultants’ proposals. For 
example, in New York, working conditions were deleted, a different 
set of factors was used, and new weights were computed. In Min¬ 
nesota, New Jersey, and Oregon, additional points or weights were 
added to certain job factors. In Connecticut, the appropriate points 
were negotiated with each of several unions. All states ignored market 
wages in making comparisons between male and female jobs (except 
Washington, as discussed in n. 1). 

Unions have played an important role in all states. Union contracts 
called for comparable worth studies at an early stage in Minnesota 
and New York. Union representatives were appointed to oversight 
committees in every state except New York. Collective bargaining 
negotiations either determined the size of the compromise or else 


10 It should be emphasized that the goal of comparable worth is to raise the relative
pay of female jobs, and, in principle, this may be done without raising the total payroll 
cost to the state provided that (a) pay is cut in a sufficient number of male-dominated 
jobs to offset the increases and (b) these jobs remain at or above competitive private- 
sector rates after the cuts. The latter may be possible given Smith’s (1977) analysis. She 
concluded that federal workers were paid well above, state workers somewhat above, 
and local-government workers slightly above private-sector workers. An increase in 
total payroll cost need arise only because of political opposition to pay cuts. 

This discussion relies heavily on telephone interviews with and documentation 
supplied by individuals who play a key role in their state's comparable worth process. 
We surveyed the states of Connecticut, Michigan, Minnesota, New Jersey, New York, 
Oregon, Washington, and Wisconsin most intensively since they have the most experi¬ 
ence with comparable worth. In addition, Massachusetts has followed a similar pattern, 
even though its pay phtn has taken market wages into account. See the individual state 
publications listed in the references. 
facilitated it by setting aside money in Connecticut, New York, and
Oregon. Unions were consulted so that implementation would coin¬ 
cide with new contracts in Minnesota. In Washington, a legal suit 
brought by the unions was dropped only after (noncontract) negotia¬ 
tions with the unions induced the state to implement its comparable 
worth plan. In Michigan, a union-filed suit charging the state with sex 
discrimination is now under appeal but appears to have played an 
important role in motivating study and some initial pay adjustments. 

Comparable worth has not led to a reduction in pay grade or pay 
rate in any state except in New York and Washington. In those two 
states, only a very small number of jobs have had cuts in grade, with 
assurances to incumbents that they will not be cut at all (in New York) 
or not cut for 6 months (in Washington). Very few employees appear 
to be affected adversely. 

All state personnel departments have played an important role in 
doing the job analysis and setting up the new pay plan. Typically the 
consultant plays only an advisory role after completing an initial 
study. In all states, factor ratings were based, in part, on question¬ 
naires filled out by incumbents and, in part, on committees of super¬ 
visors and professionals, especially from the personnel departments. 
Union representatives were involved in some cases. All states also 
have provided that appeals of job evaluations may be made, although 
typically the appeals go back to the same personnel analysts who made 
the initial decisions. 

While each state has its own unique history and political features, 
the patterns of participation by the major actors (unions, consultants, 
personnel specialists, legislators, and supervisors) are sufficiently sim¬ 
ilar to allow the conjecture that our major conclusions apply in other 
states. That is, we expect that women have gained less than originally 
proposed as other groups such as unions, supervisors, and profes¬ 
sionals protect their interests and capture parts of the gains for them¬ 
selves. At a minimum, this study has given a unique perspective on 
how political and economic forces affect the evolution of public policy 
both in general and in the context of comparable worth. 
The Rational Nonpurchase of
Long-Term-Care Insurance 


Mark V. Pauly 

University of Pennsylvania 


Only a tiny fraction of the nonpoor population currently purchases 
private insurance coverage against long-term-care (LTC) costs. Stud¬ 
ies generally attribute the failure to purchase private coverage to 
“unawareness” by potential purchasers of the benefits of coverage
and a misperception that Medicare currently covers long-term care. 
I explore alternative reasons for failure to purchase coverage by 
well-informed, expected utility-maximizing risk-averse individuals 
for whom LTC is associated with a large increase in mortality and for 
whom family members represent an alternative source of care. 
There may be no demand for LTC insurance even if it is made 
available at actuarially fair premiums because the main consequence 
of coverage is to enhance the expected value of one’s estate. 


I. Introduction 

There is very little private insurance protection against the cost of 
long-term care (LTC) in the United States. Only about 2 percent of 
nursing home costs were covered by private insurance in 1986 (Divi¬ 
sion of National Cost Estimates 1987). This is surprising since annual 
LTC costs in a nursing home are estimated to be about $22,000 per 
year for adequate quality care (Task Force on Long Term Health 
Care Policies 1987) and can be higher, while the annual likelihood 
that an elderly person will use a nursing home is relatively low, less 
than one in 10 for all elderly and still less than one in four above age
85 (Hing 1987). In contrast, nearly 70 percent of the elderly have 
purchased “Medigap” insurance coverage, which provides protection 
against the deductibles and copayments in the public Medicare policy 
(Scheffler 1988). The likelihood that some Medigap claims will be
made in any year is high. 

We seem to have here a familiar paradox in market insurance pur¬ 
chasing. The elderly fail to buy coverage against high-loss, low- 
probability events and yet do seek coverage against high-probability, 
low-loss events, exactly the opposite of rational insurance purchasing.
Are there rational reasons for this seeming irrationality?

It is not hard to understand the rationality of purchase of Medigap 
coverage. Medigap is subsidized, in effect, because its purchase, 
though triggering higher Medicare benefits, does not add to the 
Medicare premium the individual pays. Nor is it hard to understand 
why low-income elderly do not purchase nursing home coverage. In 
all states the public Medicaid program provides nursing home cover¬ 
age once a family's wealth falls below a certain level. Any private 
insurance benefits must be used before Medicaid will pay. It is easy to 
see that Medicaid, as a comprehensive insurance policy with a deduct¬ 
ible equal to one’s wealth, provides a close substitute, at a zero price, 
for private insurance coverage for low-wealth people. 

What is most puzzling is why middle-class elderly, who typically do 
have some wealth to protect and who are the most frequent pur¬ 
chasers of Medigap coverage, fail to buy LTC insurance, even when 
the chance is low that they will spend down to Medicaid eligibility. 
One possible explanation is, of course, the phenomenon Kunreuther 
(1978) has noted in other insurance markets: a tendency to ignore 
low-probability, high-loss events that have not occurred recently. 
However, this sort of behavior has not been so common in health 
insurance (Hershey et al. 1984). In the extensive policy discussion of 
this issue that has occurred, the most common explanation is that the 
elderly are misinformed. A majority of the elderly, according to sur¬ 
veys, are under the mistaken impression that Medicare already pro¬ 
vides long-term nursing home coverage (American Association of 
Retired Persons 1985). And even those knowledgeable about the limi¬ 
tation of Medicare are alleged to lack awareness of the probable need 
for LTC services. Indeed, the report of the federal Task Force on 
Long Term Health Care Policies (1987, p. 29) relies almost entirely on 
“lack of awareness” to explain what it terms “lack of demand.” The
comprehensive treatment by Davis and Rowland (1986, p. 91), in 
addition to discussing “underestimation of need” by nonpoor elderly, 
points to pricing problems, moral hazard, and adverse selection as 
reasons why private insurance is unlikely to “become a dominant 
force in the near future.” But it alleges that “the purchase of private 
insurance to protect against impoverishment in a nursing home 
would appeal to some people” and that coverage in noninstitutional 
settings is what most people would prefer. Finally, the Technical
Work Group on Private Financing of Long-Term Care for the Elderly 
(1986) likewise attributes the small size of the current market for 
private insurance to the high cost of individual insurance and the 
emphasis on institutional care benefits, despite studies indicating that 
about a quarter of the elderly could “afford” insurance even at cur¬ 
rently feasible premiums. 

In this paper, I shall argue that there are other potentially impor¬ 
tant impediments to private demand for LTC insurance, impedi¬ 
ments that would exist even if the insurance were offered at fair 
premiums. Even without loading and adverse selection, these impedi¬ 
ments could well lead to very low insurance purchases even in mar¬ 
kets in which risk-averse buyers are rational and appropriately in¬ 
formed. 

The explanation I offer is one that takes into account the special 
features of chronic illness insurance and integrates it into a model of 
lifetime expected utility maximization. I show that the rational risk- 
averse individual may well choose to leave most if not all of his or her 
LTC expenses uncovered by insurance. Particularly if only conven¬ 
tional insurance that offers benefits based on contemporaneous med¬ 
ical care costs is offered, utility-maximizing behavior may well involve 
little or no insurance. This explanation does not depend on the exis¬ 
tence of transactions costs, adverse selection, or inaccurate beliefs 
about the extent of Medicare coverage, which others have discussed 
(e.g., Friedman and Manheim 1988). I further show that there are 
some special types of insurance contracts that might be salable for 
LTC. But even in this case, I speculate that there are some intrafamily
interactions that may inhibit the purchase of coverage. 

I do not imagine that even these explanations can fully explain why 
private LTC insurance is very limited; they do permit LTC insurance 
to be rational in some circumstances. In addition to indicating what 
the circumstances conducive to coverage are, the discussion shows 
that the market for LTC insurance for the elderly is likely to remain 
relatively small, though probably not so small as it is at present. I also
consider briefly whether there is a rationale for public subsidization of 
LTC insurance for the nonpoor, if the reasons for its current non¬ 
purchase are as I have outlined. 

II. What Does Long-Term-Care Insurance 
Protect? 

I begin with a simple model of the illness process associated with 
chronic care. I assume that there are two types of illness, chronic and
acute. Medical care does not itself yield utility. Moral hazard is ruled
out, and it is assumed that there is a unique quantity that constitutes 
appropriate “nursing home” care in the event of chronic illness. 

Conventional health insurance does not cover long-term care for 
chronic illness. Instead, it covers medical expenses associated with 
acute illness. The individual who suffers an acute illness requires 
costly medical care; if this care is consumed, he or she has a high 
probability of recovering to normal functioning. The cost of acute 
illness can be viewed as a once-and-for-all reduction in the disposable 
income available for the future consumption the person truly values. 
Chronic illness, in contrast, is not cured. One way to represent its cost 
is to imagine that its main effect is to reduce the individual’s capacity 
for normal functioning and household production. In addition, data 
suggest that elderly people with illnesses who enter a nursing home 
have much lower life expectancies compared with either those elderly 
who are not ill or those who have only acute illnesses, other things 
equal. 

Let us first consider a simple case in which long-term or chronic 
illness implies a fixed expenditure per year of $X and from which 
there is no recovery or improvement. While the assumption of no 
improvement from a chronic illness is not strictly true, it is the case 
that less than 25 percent of the elderly admitted to nursing homes are 
discharged to their homes or families (Sekscenski 1987). We may also 
reasonably assume that chronic illness implies a substantial reduction 
in life expectancy. For example, the annual mortality rate for 80-year- 
old women is about 6 percent overall but was 27-30 percent among 
elderly of the same average age who were candidates for formal long¬ 
term care in a large-scale demonstration project (U.S. National Cen¬ 
ter for Health Statistics 1986; Applebaum et al. 1988). 

The person is assumed to be without a spouse 1 and to have some
money wealth W at the beginning of the planning period. In this 
initial case the person is assumed as well to have no other heirs and to 
make all decisions about insurance or care purchases himself or her¬ 
self. We represent the expected lifetime utility function (EU) by 

\[ EU = \sum_{t=1}^{H} p_t^h U(C_t) + \sum_{t=1}^{H} p_t^s U^s, \]

where \(p_t^h\) is the probability of surviving to period t in the healthy state,
\(C_t\) is dollars of consumption in period t, \(p_t^s\) is the probability of surviving
in the sick state, H is the maximum length of life, and \(U^s\) is the
utility level if one is sick with chronic illness and consuming $X worth

1 It is estimated that approximately 84 percent of elderly nursing home residents are
without spouses (Hing 1987). 
of care per time period. In the sick state, all desired consumption is
assumed to be furnished by the payment X. 
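
As a rough illustration of the expected lifetime utility function just described, the following sketch evaluates it for given survival probabilities. The logarithmic utility of consumption, the function name, and all numbers are assumptions made here for illustration; none of them come from the paper.

```python
import math
from typing import Callable, Sequence


def expected_lifetime_utility(
    p_healthy: Sequence[float],
    p_sick: Sequence[float],
    consumption: Sequence[float],
    u_sick: float,
    utility: Callable[[float], float] = math.log,  # assumed functional form
) -> float:
    """Sum over periods of p_healthy[t] * U(C[t]) plus p_sick[t] * u_sick.

    u_sick is the fixed utility level when chronically ill and receiving
    X dollars of care; it does not depend on consumption.
    """
    if not (len(p_healthy) == len(p_sick) == len(consumption)):
        raise ValueError("all sequences must cover the same periods 1..H")
    healthy_term = sum(p * utility(c) for p, c in zip(p_healthy, consumption))
    sick_term = sum(p * u_sick for p in p_sick)
    return healthy_term + sick_term


# Hypothetical three-period example (illustrative numbers only).
eu = expected_lifetime_utility(
    p_healthy=[0.90, 0.70, 0.40],
    p_sick=[0.05, 0.10, 0.10],
    consumption=[20_000.0, 18_000.0, 15_000.0],
    u_sick=5.0,
)
```

Because the sick-state term does not depend on consumption or on remaining wealth, extra dollars delivered to the sick state cannot raise this function, which is the property the later argument relies on.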

If perfect insurance markets are available, the lifetime expected 
utility maximization problem (from t = 1 onward) is to choose \(C_t\) in
order to maximize EU subject to 

\[ W \ge \sum_{t=1}^{H} p_t^h C_t + \sum_{t=1}^{H} p_t^s X, \]

where W is initial wealth. The solution to this problem will be to use W 
to purchase an annuity, but an annuity that pays $X per time period if 
one is sick and $\(C_t\) if one is well.

However, such perfect annuity markets do not exist. It is more 
realistic to analyze a case in which no annuities are available. Suppose 
that \(W \ge SX\), where S is the maximum number of periods the person
will survive if sick; the individual initially has enough wealth to be able 
to pay his maximum LTC costs. 2 We capture the notion that the 
person is unlikely to be eligible for Medicaid by assuming that SX is 
small relative to W. The maximand remains the same, but the budget 
constraint then becomes \(W \ge \sum_t C_t + SX\).

If the person does survive long enough so that his wealth in period t 
falls to the level at which the constraint \(W_t > SX\) is not satisfied (i.e., if
there is a possibility that nursing home expenses might exhaust one’s 
wealth) and if the person will still consume $X per year when chronic 
illness strikes, both the bankruptcy laws and the Medicaid program 
operate to ensure that utility in the illness state does not slip below \(U^s\). 3
That is, if the individual will receive $X of care no matter what and if
his estate cannot be negative, then, at worst, it is as if W - SX = 0. 

In this situation there will be no demand for nursing home insur¬ 
ance, even if it is offered on an actuarially fair basis. The reason is 
obvious: insurance premiums for coverage against X in any period 
will reduce C. Coverage will add to the bequest that would be left if 
the person dies after a chronic illness if wealth in any period exceeds 
SX; however, in this model bequests offer no utility. If wealth falls 
below SX, private coverage substitutes for Medicaid. There is no de¬ 
mand for private insurance even if we assume that the person is risk
averse and the occurrence of chronic illness is a random event. Insur¬ 
ance is not bought because the marginal utility of an additional dollar 
in the (lifetime) chronic illness state has been defined to be zero. 
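
The argument can be checked numerically in a stylized one-period version. The sketch below assumes logarithmic utility of consumption, a fair premium equal to the probability of illness times the coverage bought, and a premium paid entirely out of planned consumption; these simplifications and all numbers are illustrative assumptions, not the paper's specification.

```python
import math


def eu_with_coverage(
    coverage: float,
    planned_consumption: float,
    p_healthy: float,
    p_sick: float,
    u_sick: float,
) -> float:
    """One-period expected utility when `coverage` dollars of LTC cost are insured.

    The fair premium p_sick * coverage reduces consumption in the well state.
    Utility in the sick state is fixed at u_sick, and the larger estate that
    coverage leaves behind yields no utility.
    """
    premium = p_sick * coverage
    consumption = planned_consumption - premium
    if consumption <= 0:
        raise ValueError("premium exhausts planned consumption")
    return p_healthy * math.log(consumption) + p_sick * u_sick


# Hypothetical coverage levels, up to an annual nursing home cost of $22,000.
levels = [0.0, 5_000.0, 10_000.0, 22_000.0]
values = [eu_with_coverage(q, 20_000.0, 0.90, 0.08, 5.0) for q in levels]
best = levels[values.index(max(values))]  # coverage level with highest EU
```

Expected utility falls as coverage rises, so the preferred coverage in this sketch is zero even though the premium is actuarially fair and the person is risk averse.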

If the individual obtains no utility from bequests, Kotlikoff and
Spivak (1981) have shown that, with annuities unavailable, planned 

2 In practice, S would vary with age.

3 The Medicaid program pays for all nursing home care once the person has “spent
down” wealth to approximately zero. 
consumption declines with age and the expected bequest is positive. 4
Even with no utility from bequests, the individual’s bequest will be 
relatively large at young ages and then decline with age, equaling zero 
at age H. 

III. Variable Nursing Home Quality 

There might be private demand for insurance if the “quality” of 
nursing home care is variable. Instead of X being a fixed level of 
annual expenditure for a nursing home stay, one might imagine that 
the desired level would be an increasing function of initial wealth, so 
that X = X(W). The level of spending provided by Medicaid is fixed 
at \(\bar{X}\).

Formally, we can simply substitute \(X_t\) for X in the maximand and the
constraint of the previous problem and solve for the optimal pattern
of \(C_t\) and \(X_t\), both conditional on the state of health. In this case,
however, as the person ages, there may be demand for insurance to
finance the difference between \(X_t\) and \(\bar{X}\). However, buying insurance
coverage that pays for just this amount, as a supplement to Medicaid,
is not permitted by the Medicaid program. That program takes prior
private insurance coverage into account before paying Medicaid benefits
and will in any case pay no more than \(\bar{X}\). Consequently, the
person who desires a greater level of X must be prepared to forgo any 
Medicaid benefits. 

Insurance will be purchased if the expected utility level with insur¬ 
ance is greater than the expected utility level without insurance. Con¬ 
sider the effect on expected utility of buying insurance that will pay 
some amount \(X > \bar{X}\). As before, with no marginal value attached to
bequests, there will be no demand for insurance for periods in which
wealth at the beginning of the period exceeds X. This means that if
insurance is to be purchased, it will cover the last years of life. Let us
therefore consider the purchase of insurance in the last year of life
(t = H) and let us assume that the wealth level \(W_H\) (which will all be
spent on consumption \(C_H\) if the person does not use a nursing home)
is less than the Medicaid level \(\bar{X}\).

If \(p_H^s\) is the probability of using a nursing home during period
H, the net fair premium for coverage that permitted the person to
consume \(X_H\) of nursing home care would be \(P = p_H^s (X_H - W_H)\). However,
the net expected value of the benefits to be received from purchasing
insurance would be \(p_H^s (X_H - \bar{X})\) since Medicaid would have
paid \(\bar{X}\) in any case. (Conversely, the net value of Medicaid benefits is

4 This is in contrast to the case with perfect annuities available, in which bequests are
zero. 
\(p_H^s[\bar{X} - W_H]\).) Since the expected value with insurance is therefore less
than the fair premium, it follows that a risk-averse person may not
purchase insurance for this time period; an analogous argument
holds for other time periods. Insurance would be more likely to be
purchased if the gain in utility from moving from \(\bar{X}\) to \(X_H\) was nevertheless
high, that is, if the person highly values additional quality.
Given some value for additional quality, insurance will be more likely
to be purchased the larger is \(W_H\) relative to \(\bar{X}\); the lower is the value of
the Medicaid benefits, the less likely their presence will discourage
coverage.
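
The last-period comparison can be made concrete with a short sketch. The function and argument names are hypothetical, and the numbers are purely illustrative; the formulas follow the premium and benefit expressions in the text, under the stated assumption that last-period wealth lies below the Medicaid level.

```python
import math


def last_period_insurance(
    p_sick: float,
    x_desired: float,   # nursing home spending the person would like to insure
    x_medicaid: float,  # fixed level of spending Medicaid would provide
    wealth: float,      # wealth at the start of the last period
) -> tuple:
    """Return (fair premium, net expected benefit, net value of Medicaid)."""
    if not (wealth < x_medicaid < x_desired):
        raise ValueError("sketch assumes wealth < Medicaid level < desired level")
    fair_premium = p_sick * (x_desired - wealth)
    # Medicaid would have paid x_medicaid anyway, so only the top-up counts.
    net_benefit = p_sick * (x_desired - x_medicaid)
    medicaid_value = p_sick * (x_medicaid - wealth)
    return fair_premium, net_benefit, medicaid_value


premium, benefit, medicaid = last_period_insurance(
    p_sick=0.2, x_desired=30_000.0, x_medicaid=22_000.0, wealth=15_000.0
)
# The shortfall of benefit below premium equals the forgone Medicaid value.
shortfall_matches = math.isclose(premium - benefit, medicaid)
```

The gap between premium and net benefit shrinks as wealth approaches the Medicaid level, which matches the observation that purchase becomes more likely the larger wealth is relative to that level.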


IV. Utility from Bequests 

What if the model is modified so that the individual does have chil¬ 
dren or heirs and therefore the possibility of utility from bequests? 
(The person is still assumed to be the only demander of insurance or 
care.) Adding the possibility of utility from bequests does not neces¬ 
sarily change the conclusion. The simplest example of the same con¬ 
clusion would be a case in which there is zero marginal utility from 
bequests at the level of consumption that would be chosen in the 
absence of a bequest motive. Since bequests are positive in all periods 
but the last one, there can still be additions to total utility from the 
existence of such bequests (in the sense that positive bequests are 
preferred to zero bequests), but no desire to add to these bequests. 

Even if the marginal utility of bequests to heirs in the “selfish” 
equilibrium is positive but small, there may be no demand for LTC 
insurance. The cost of adding one dollar to one’s estate after a long-term
illness and death is the sacrifice of $\(p^s\) in current consumption in
the “well” state. Since the marginal utility of current consumption is
surely positive, it is quite possible that \(p^s U'(C_t)\) is greater than the
marginal utility of an extra dollar of bequest at wealth level W - X 
(for a one-period illness) or at other wealth levels associated with long 
nursing home stays. 
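
A minimal sketch of this marginal comparison follows. It assumes logarithmic utility of consumption and a bequest utility equal to a weight times the logarithm of the estate; both functional forms, the function name, and the numbers are assumptions chosen only to illustrate the comparison stated in the text.

```python
def wants_more_estate(
    p_sick: float,
    consumption: float,
    wealth: float,
    ltc_cost: float,
    bequest_weight: float,
) -> bool:
    """True if an extra dollar of post-illness estate is worth its cost.

    Cost side: p_sick * U'(C) with U = log, i.e. p_sick / consumption.
    Gain side: marginal bequest utility at estate W - X, with bequest
    utility bequest_weight * log(estate), i.e. bequest_weight / (W - X).
    """
    estate = wealth - ltc_cost
    if consumption <= 0 or estate <= 0:
        raise ValueError("consumption and the post-illness estate must be positive")
    cost = p_sick / consumption
    gain = bequest_weight / estate
    return gain > cost


# Hypothetical household: a weak bequest motive gives no demand at the margin.
weak_motive = wants_more_estate(0.08, 20_000.0, 200_000.0, 22_000.0, 0.1)
strong_motive = wants_more_estate(0.08, 20_000.0, 200_000.0, 22_000.0, 1.0)
```

With these illustrative numbers, a positive but small bequest weight leaves no demand for coverage, while a much stronger bequest motive reverses the result.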

What is true is that positive marginal utility from bequests will 
always alter the planned consumption stream out of wealth. The rea¬ 
son is straightforward; deferring a dollar’s worth of spending to the 
next period provides both enhanced consumption opportunities next 
period if one survives and an increased estate if one does not survive. 
If the second benefit becomes positive, one will be induced to choose 
lower levels of current consumption, given wealth, in any time period 
but the last. However, even at this unselfish consumption pattern, 
there may still be no demand for LTC insurance since insurance (in 
contrast to saving) does not enhance future consumption opportuni¬ 
ties in the healthy state. 
If, in the absence of chronic illness, the desired bequest does exceed
the actual bequest at some point over one’s expected life, this point is 
more likely to occur in the distant future than in the near future. 
Hence, a threat to bequests from chronic illness, for someone who is 
buying coverage for chronic illness that starts in the next time period, 
is likely to occur in the more distant future. But these are the time 
periods to which survival with chronic illness is unlikely. At a mini¬ 
mum, then, an optimal chronic insurance policy would carry a large 
deductible, even in the absence of loading costs, and would provide cover¬ 
age only against the very rare coincidence of events that the person (a) 
lives “too long” and (b) has a chronic illness. With loading costs to
selling an insurance policy, there may well be little demand for such 
insurance. 

In effect, the gain to a risk-averse person from buying coverage 
against LTC costs is less than the gain from insuring an acute care 
expense of equal amount. Hence, even at a modest loading, people 
may not be willing to buy LTC insurance. The greater the utility from 
bequests and the less sharply marginal utility from bequests declines 
with age, the greater the demand for LTC insurance. 

When will there be a strong expectation of a demand for LTC 
insurance? If term life insurance is available and if individuals choose 
to buy such term insurance, we can say that they ought surely also 
then to be willing to buy LTC insurance. If term insurance is pur¬ 
chased for the next time period, this means that 

\[ (1 - p_{t+1}^h - p_{t+1}^s)\, U_B'(W_{t+1}) = U'(C_t), \]

where \(U_B'\) is the marginal utility of bequests. But since the estate following
a long-term illness is less than \(W_{t+1}\) (e.g., it would be \(W_{t+1} - X\)
for a one-period illness), it follows that the marginal utility of an 
additional dollar in the “costly terminal illness” state is greater than 
the marginal utility of a dollar in the sudden-death state. So purchase 
of life insurance ought to be accompanied by purchase of LTC insur¬ 
ance, and (given equal loading) LTC insurance should provide full 
coverage. 5 (It is, however, somewhat logically inconsistent to admit 


5 It is also possible (though rare) that a long stay in a nursing home will be followed by
recovery to the healthy state. How will this affect the demand for insurance coverage of
LTC costs? If insurance is of the conventional sort, paying benefits based on current
expense levels, there may well still be no demand for insurance. Suppose that a person
has a probability of recovery in period t of \(p_t^r\). The probability of being sick is, as before,
\(p_t^s\). The risk-averse person would prefer paying \((p_t^s \cdot p_t^r)\) per dollar of coverage to facing
the risk of paying one dollar for nursing home costs and then recovering. That is, he
would want to insure against cost in the case of the joint event of becoming sick and
then recovering. However, the premium for nursing home insurance will be the larger
number \(p_t^s\) if insurance is of the conventional type, paying costs as they are incurred,
and not making payment conditional on recovery. If the insurance pays benefits condi- 
life insurance to the model and yet continue to assume no annuities
since buying term life insurance is really equivalent to selling an¬ 
nuities [Yaari 1965].) 

In any event, the purchase of term life insurance by people over 65 
is also quite rare; it is estimated that only about 2 percent of elders 
currently buy such insurance (personal communication, Life Insur¬ 
ance Marketing and Research Association). It is true that death bene¬ 
fits are available from whole life insurance and that a larger fraction 
of the elderly have such policies, often fully paid up. But it is probably 
more reasonable to think of such policies as a way to accumulate 
savings rather than as a way to provide death benefits. 

Moreover, important recent work by Hurd (1987) suggests that, at 
least at the margin, consumption of the elderly appears to be unaffected
by a bequest motive. Wealth follows the declining life cycle
pattern, and the elderly with children do not leave larger bequests 
than the elderly without children. 

A more complex case concerns the elderly person with a surviving 
spouse sharing the same household. In such a case, a nursing home 
stay is likely to reduce the real lifetime consumption available out of a 
given income for the spouse if the total cost of nursing home care 
exceeds the discounted value of the future reduction in household 
expense associated with the death of the ill spouse. Impoverishing 
one’s spouse, rather than one’s children, seems to be the major fear of 
many married elderly. 

The appropriate level of LTC insurance in the case in which one 
spouse survives is much more difficult to specify, for two reasons. 

First, the death of one spouse will affect both the income and the 
consumption of the household. Income is affected because pension 
income is often lost on the death of the spouse receiving the pension. 
Consumption is affected as long as all household consumption is not 
fully joint. As Auerbach and Kotlikoff (1987) have noted, the net 
effect of the death of a spouse on the survivor's consumption oppor¬ 
tunities depends on a comparison of the income that would have been 
received by the decedent with the consumption the decedent would 
have experienced. At one extreme, if the death of the spouse does not 
affect income at all (because all provision for retirement consumption 
comes from wealth), then the death of one spouse will increase the 


tional only on the occurrence of LTC expense, much of the benefit is wasted by being 
paid in a situation from which there was no recovery. These observations point toward
the kind of insurance policy that would increase expected utility. The policy would be 
one that paid nursing home costs only if the person recovered. Such policies do not 
currently exist in the United States, so the failure of conventional coverage to be 
purchased is not surprising. Life insurance policies that pay some benefits before death 
for insureds in nursing homes have, however, been recently introduced. 
consumption opportunities for the survivor. Adding LTC costs
makes death more costly, but the net effect of a death, even one 
accompanied by LTC costs, can be to increase the consumable wealth 
expected by the survivor if the LTC costs are less than the present 
value of the future consumption had the person survived. In such a 
setting, neither life insurance nor LTC insurance need be worthwhile 
unless and until LTC costs mount so high as to reduce the survivor’s 
wealth. At the other extreme, if a sizable portion of income stops on 
the death of a spouse but if most consumption is joint, there will be a 
sizable demand for LTC insurance. 

These arguments nevertheless suggest in general a large deductible 
in any LTC policy. Suppose, for simplicity, that there are no joint 
costs (“local public goods”) in the household and that consumption
expense is divided equally. Then the deductible in an LTC policy that 
maintained the consumption opportunities of the surviving spouse 
would be equal to half of wealth. If, in contrast, the income of the 
household came from a pension that ceases on the death of one 
spouse, coverage should be greater. 
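
The deductible rule in this simple case can be sketched as a payout schedule. The function name and figures are hypothetical, and the sketch assumes, as in the example above, no joint household costs, equal division of consumption expense, and all consumption financed out of wealth.

```python
def ltc_payout(ltc_cost: float, household_wealth: float) -> float:
    """Insurer payment under a deductible equal to half of household wealth.

    LTC costs up to the ill spouse's half share do not reduce what the
    survivor would have consumed, so only costs above that share are paid.
    """
    if ltc_cost < 0 or household_wealth < 0:
        raise ValueError("cost and wealth must be nonnegative")
    deductible = 0.5 * household_wealth
    return max(0.0, ltc_cost - deductible)


# Hypothetical household with $100,000 of wealth.
small_claim = ltc_payout(30_000.0, 100_000.0)  # below the deductible
large_claim = ltc_payout(80_000.0, 100_000.0)  # above the deductible
```

A pension that stops on the death of one spouse would call for a smaller deductible than this sketch uses, since the survivor then loses income as well as wealth.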

The second complexity arises because care for a chronically ill per¬ 
son can also be furnished by the spouse rather than in a nursing 
home. Provision of such care surely represents a reduction in the real 
consumption the spouse experiences. But the implicit cost of quality- 
adjusted spouse-provided care may be less than the cost of market- 
purchased nursing home care. Moral hazard may nevertheless lead to 
a substitution of the latter for the former; the ideal arrangement 
would be to make a cash payment equal to the subjective opportunity 
cost of spouse-provided services conditional on the occurrence of 
chronic illness, regardless of which type of care is actually used. (This
assumes that the marginal utility of money for the healthy spouse is 
not reduced by the occurrence of illness for the partner.) However, 
such a strategy may not be feasible for an insurer. Desired bequests 
will nevertheless be more likely to exceed actual “selfish” bequests 
when a spouse is present than when no spouse is present, so one 
would expect a stronger demand for LTC insurance in such cases. 


V. Long-Term-Care Insurance and Intrafamily 
Bargaining 

There is another source of demand for chronic care insurance. As 
noted above, the major function of such insurance is to protect the 
estate the individual leaves. Even if the individual has no utility for 
bequests, the heirs presumably do. If the heirs are risk averse, one 
would expect them to purchase nursing home insurance for the el¬ 
derly individual. For instance, one way to look at such insurance is as 
a supplement to life insurance to protect against the event that life
insurance proceeds are consumed by paying unpaid nursing home 
bills; if the heirs expected to rely on life insurance, they might be 
expected to insure its payment. However, to develop this point fur¬ 
ther, we need a model of family behavior. 

The analysis thus far has viewed the individual purchaser as pur¬ 
chasing insurance with regard only to his own behavior. A more 
general approach, as suggested by Bernheim, Shleifer, and Summers 
(1985), is to imagine that parents may be able to affect the behavior of 
their heirs by manipulating future bequests, and may wish to do so. 
The reason for wanting to affect behavior comes about because indi¬ 
viduals are assumed to prefer, other things equal, that certain actions 
be performed by family members rather than provided by commer¬ 
cial firms or hired strangers. Less formally, but realistically, parents 
prefer care from their children to care from others. This motivation 
might especially be thought to characterize care for chronic illness or 
increasing frailty. Other things equal, including the subjective or ob¬ 
jective cost of care, most people would probably prefer to be cared for 
by their own family, in their own surroundings, rather than be moved 
to a nursing home or even to be attended by strangers in their own 
home. 6 While one would realize that there will be some circumstances 
in which family-provided care is infeasible, one wishes those circum¬ 
stances to be made rare. 

Bernheim et al. represent this idea by suggesting (in this context) 
parental and child (or heir) utility functions of the form 

\[ U_P = U_P[C_P, A, M_P, U_K(C_K, A)] \tag{1} \]

for the parent and

\[ U_K = U_K(C_K, A) \tag{2} \]

for the child, where \(C_P\) is parents' consumption, A is “activity” or
“attention” from children, \(M_P\) is medical expenditure on parents, \(U_K\)
is child’s utility function, and \(C_K\) is child’s consumption.

In this general model, parents manipulate bequests for heirs in 
order to affect the behavior of their children. If the elderly person 
were to remain fully able to manipulate potential bequests until death, 
he would do so in such a way as to bring forth his utility-maximizing 
level of A. In this model, A might represent family help in caring for 
a person with chronic illness. The unselfish parent would then use the 


6 Studies that showed that the parents would prefer not to burden their children with 
caring for them do not contradict this assumption since in those studies, the choice was 
always between free care from others and care by one's family with a high subjective 
cost. 
bequest to motivate A and would potentially buy insurance against at
least some of the nursing home costs, costs that will be incurred when 
A is either too (subjectively) costly or too ineffective. However, there 
are probable features of chronic illness and LTC insurance that may 
limit this conclusion and that call for a modification of the model. 

The elderly person probably correctly anticipates that his power to 
manage or choose both his own consumption and his bequest levels 
will be limited once illness strikes. That is, a person too feeble to 
manage any semblance of household production may also be judged 
incapable of manipulating bequests. While one could write a clause in 
one’s will to the effect that “if my children put me in a nursing home 
unnecessarily, they are disinherited,” such a clause would be impossi¬ 
ble to enforce. And in addition, there will be moral hazard associated 
with insurance coverage of formal long-term care. That is, the pres¬ 
ence of LTC coverage will encourage the children to initiate more 
formal (non-family-provided) care than would be the case without 
insurance. Without insurance, a dollar spent on nursing home care 
for a parent reduces bequests by a dollar, but if full insurance is 
available, there will be no user cost to the nursing home. A formal 
model of LTC insurance with moral hazard would be identical to 
other such models (Pauly 1968; Zeckhauser 1970), except that the 
identity of the decision maker whose demand is effective will depend 
on the “state.” 

What this means is that the elderly individual who is still capable of 
deciding on his insurance coverage but whose children will control 
the level of care should he become ill may have a higher expected 
utility with no insurance on his LTC costs than with insurance. Figure 
1 illustrates. Let $D_P$ be the parent’s own demand for formal care (as a
substitute for family-provided care), $D_K$ the demand by the child for
care for the parent, and MC the marginal cost of care, assumed to be
equal to the price. If chronic debilitating illness of the parent does
occur, the child’s demand curve fixes the level of $M_P$ (and, by inference,
the level of $A$) since the child then controls the decision on type
of care to be provided to the parent. With no insurance, the child
chooses the level of formal care $Q_K^0$ for the parent. Were there to be
insurance coverage at, say, $I^*$, the parent will receive instead formal
care in the larger amount $Q_K^I$. Suppose that the elderly person would
prefer insurance coverage $I^*$ if he could control the level of care and
receive $Q_P^I$. But since he will be forced to “overconsume” formal care
and receive less child attention $A$ than at $Q_P^I$, and perhaps pay the
additional premium for the insurance as well, he may well prefer no
insurance. That is, because he prefers to consume at $Q_K^0$ rather than at
$Q_K^I$, he is willing to forgo the risk reduction benefits of insurance
coverage. This is so even if the parent is risk averse and would buy
insurance could he control the levels of $A$ and $M_P$. It could also happen
even if the parent internalized any risk aversion the child might have.
Since the parent loses control when he becomes ill, he may prefer that 
his children at least recognize that putting him in a nursing home 
reduces their potential estate dollar for dollar. 
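
The moral hazard argument can be illustrated numerically. The following is only a sketch under strong assumptions that are not in the paper: both demand curves are linear with a common slope, the child's curve lies above the parent's, insurance pays a fixed share of the price, and the parent's loss from overconsumption is measured crudely as the area between marginal cost and the parent's own demand curve. All names and numbers are hypothetical.

```python
def formal_care_demanded(intercept, slope, user_price):
    """Quantity on a linear demand curve: price = intercept - slope * quantity."""
    return max(0.0, (intercept - user_price) / slope)


def overconsumption_loss(q, q_parent, slope):
    """Area between marginal cost and the parent's demand curve from q_parent to q."""
    return 0.5 * slope * (q - q_parent) ** 2


MC = 10.0                # marginal cost = price of formal care (assumed)
SLOPE = 1.0              # common slope of both demand curves (assumed)
PARENT_INTERCEPT = 14.0  # parent's own demand for formal care (assumed)
CHILD_INTERCEPT = 16.0   # child's demand, lying above the parent's (assumed)
COVERAGE = 0.5           # share of the price paid by the insurer (assumed)

# Quantity the parent would pick at the full price.
q_parent = formal_care_demanded(PARENT_INTERCEPT, SLOPE, MC)
# Quantity the child picks when each dollar of care reduces the bequest by a dollar.
q_child_uninsured = formal_care_demanded(CHILD_INTERCEPT, SLOPE, MC)
# Quantity the child picks when insurance lowers the user price.
q_child_insured = formal_care_demanded(CHILD_INTERCEPT, SLOPE, (1 - COVERAGE) * MC)

loss_uninsured = overconsumption_loss(q_child_uninsured, q_parent, SLOPE)
loss_insured = overconsumption_loss(q_child_insured, q_parent, SLOPE)
```

In this example insurance moves the child's choice further from the quantity the parent would have chosen, so the parent's loss from overconsumption grows. Whether that loss outweighs the risk-reduction benefit depends on the parent's risk aversion, which the sketch does not model.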

These arguments explain why the elderly may not even permit 
their children to buy insurance on their behalf. They also provide 
another explanation why elderly with some concern for their children’s
welfare may nevertheless be unwilling to buy coverage themselves.
While they would like to assure themselves that their children
will have an adequate estate or that their children will not be subject to 
risk, they do not want to distort the incentives their children will face. 
Since we imagine that the decision to purchase LTC insurance is 
made before the elderly person becomes enfeebled, we imagine that 
strategic manipulation of bequests can be used to prevent insurance 
purchasing by one’s children. Once chronic illness has occurred, no 
insurer will sell insurance. In effect, the model is one in which an 
elderly person can choose the incentives that confront his family 
members, but not their actions. 


VI. Public Policy toward Long-Term Care in the 
United States 

From this viewpoint we can also examine in a preliminary way the 
proposals that have been made to alter LTC financing in the United 
States. Bowen and Burke (1985), for example, proposed that a tax-shielded
“individual medical account” should be created. In one version, part of
the funds deposited in this account would be available
(with interest) for one’s own nursing home costs, and part would be 
pooled with the contributions of others in a kind of LTC insurance. It 
is modeled after the individual retirement account, but with funds 
earmarked for long-term medical care. 

The fundamental question is whether a private decision to avoid 
LTC insurance because the value of the dollars in the “sick state” is 
low, or because of intrafamily moral hazard, ought to be overridden 
by a tax subsidy. If the market for annuities remains imperfect, such a 
subsidy does not necessarily improve welfare. Of course, if the tax 
subsidy (or any other intervention) could reduce the administrative 
cost of insurance, it could be worthwhile. Indeed, improving the mar¬ 
ket for annuities might be the most important first step in encourag¬ 
ing a market for LTC insurance. But in the current situation, sub¬ 
sidies to the nonpoor may well not be justified. In a similar vein, 
improving the ability to define and measure the circumstances that 
can trigger nursing home benefits, and thus avoid intrafamily moral 
hazard, is likely to be more efficient than subsidizing current insur¬ 
ance products. 

These conclusions must be tentative because, as in virtually any 
other second-best situation, unequivocal theoretical conclusions are 
difficult to obtain. If LTC insurance can be structured as an annuity 
substitute, provision of such coverage can improve welfare, in part by 
reducing the need for unintended bequests. This rationale would 
apply more strongly to the (relatively unlikely) “recovery from 
chronic illness” situation. A subsidized or tax-funded pay-as-you-go
LTC insurance might, in such a case, depress savings just as
pay-as-you-go social security does. The main benefit of more
extensive LTC insurance, public or private, would be the benefit ob¬ 
tained by risk-averse heirs from reducing the risk attached to the 
inheritance they will receive. This gain has not been identified in the 
LTC insurance debate as a matter of serious public concern. There is 
a case for subsidizing coverage that reduces the likelihood of Medi¬ 
caid spending, which I have discussed elsewhere (Pauly 1989). But for 
the nonpoor elderly this paper discusses, who are exactly the elderly 
whose behavior would be most affected by tax subsidies, the Medicaid 
savings are probably small. 

The demand for LTC insurance will be greatest among those who 
already purchase (term) life insurance. The nonelderly (who are nev¬ 
ertheless at some risk for nursing home care), whose costly nursing 
home stay before death would deprive a surviving spouse of signifi¬ 
cant income, would seem to be the major candidates for coverage. 




Widows and widowers, even those who can “afford” coverage (in the 
sense of having income sufficient to cover premiums), will probably 
remain reluctant to purchase. 


VII. Conclusion 

The models in this paper help to explain why a rational risk-averse 
person who is not poor might, nevertheless, choose not to buy con¬ 
ventional insurance against nursing home care costs. Such coverage 
serves primarily to protect bequests that, with imperfect annuities, are 
likely to be excessive in any case. And coverage makes it too easy for 
children to substitute formal care provided by others for the informal 
care rendered by the children. 

There may therefore be good reasons why people, especially non¬ 
poor people, do not buy LTC insurance. The mere absence of cover¬ 
age does not necessarily imply the existence of a problem of market 
failure requiring government intervention. 
This paper analyzes theoretically and empirically the role and
significance of occupational mobility in the labor market, focusing on
individuals' careers. It provides additional dimensions to the analysis
of investment in human capital, wage differences across individuals,
and the relationships among promotions, quits, and interfirm
occupational mobility. It is shown that part of the returns to education is
in the form of higher probabilities of occupational upgrading, within 
or across firms. Given an origin occupation, schooling increases the 
likelihood of occupational upgrading. Furthermore, workers who 
are not promoted despite a high probability of promotion are more 
likely to quit. 


I. Introduction 

Occupational mobility is an outstanding characteristic of the Ameri¬ 
can labor market; very few workers perform the same task through¬ 
out their working lives . 1 


We are deeply indebted to Joe Altonji, Jacob Mincer, Chris Paxson, Sherwin Rosen,
The economic literature pertaining to the role of occupations in the
labor market has, for the most part, focused on occupational choice. 
Studies of occupational mobility have been conducted within a job¬ 
matching framework in which occupational mobility is assumed to be 
the outcome of changes in the information set, market conditions, or 
workers’ characteristics (e.g., Miller 1984). With the notable exception 
of Rosen (1972), the fact that job mobility is an integral part of work¬ 
ers’ careers, however, has been virtually neglected. 

This paper analyzes theoretically and empirically the role as well as 
the significance of the phenomenon of occupational mobility in the 
labor market focusing on individuals’ careers. 3 The study provides an 
additional dimension to the existing analysis of prominent labor mar¬ 
ket phenomena including investment in human capital, wage profile 
differences across individuals, and the relationships among promo¬ 
tions, quits, and interfirm occupational mobility. 

An econometric model of career mobility is presented and several 
implications of the theory are tested. The relationships among occu¬ 
pational mobility, returns to schooling, and firm separation are ana¬ 
lyzed. The effects of different characteristics on the probability of 
career mobility are estimated, and the differences are examined be¬ 
tween workers who move along their career path within the firm and 
those who do so by moving across firms. 

The introduction of the concept of occupations and occupational 
mobility into the study of investment in human capital and labor 
mobility explicitly captures heterogeneity in human capital. Namely, 
skills are to a large extent occupation specific, and their transferability 
across jobs is limited. 4 Constraints are therefore added to the process 
of investment in human capital and to the movement across several 
activities over the life cycle. The implications of these constraints on 
the optimal time path of investment in human capital and on workers' 
mobility across firms are examined. 

It is shown that, as the theory predicts, part of the return to educa¬ 
tion is in the form of a higher probability of occupational upgrading. 


2 In Rosen's model, jobs/occupations differ by the amount of on-the-job training they
provide. The realization of an optimal path of investment in human capital, over the
life cycle, might involve jobs/occupational mobility.

3 Spilerman (1977) defines “career line” or “job trajectory” as “a work history that is
common to a portion of the labor force” (p. 551). Following Slocum (1974), he uses the
term “career” to refer to an individual's job history and the terms “career line” and “job
trajectory” “to denote an empirical regularity in the labor force.” Sommers and Eck
(1977) use the term “career ladder,” which they define as “a series of occupations
forming a path of advancement, usually through gaining skills and experience, to a
higher status occupation” (p. 5).

4 Weiss (1971) analyzes the implications of occupation-specific skills on investment in
human capital.
within or across firms. Given an occupation of origin, schooling
increases the likelihood of occupational upgrading. Furthermore,
workers who are not promoted despite a high expected probability of 
promotion are more likely to quit the firm. 


II. A Theory of Career Mobility 

This section constructs a theoretical model of optimal career choice, 
firm separation, and occupational mobility. The model is character¬ 
ized by a variety of occupations that are available for individuals 
within as well as across firms. Occupations are related to each other by 
the transferability of skills. In the presence of differences in ability 
(and therefore schooling) across individuals, the sequence of occupa¬ 
tions that forms the individuals’ optimal career path may differ. 

Individuals’ optimal career paths may involve intrafirm mobility as 
well as interfirm mobility. Intrafirm career mobility (“promotion") is 
subject to the employer’s decision, whereas interfirm mobility and its 
optimal timing are determined by the individuals who choose the 
optimal quitting time so as to maximize their expected lifetime earn¬ 
ings. Intrafirm career mobility is uncertain. The probability of pro¬ 
motion is a function of schooling, ability, and job experience. The 
optimal investment in human capital as well as the optimal quitting 
time maximize the individual's expected lifetime income. 

Since the focus of the discussion is the transferability of skills across 
occupations, we ignore the effect of on-the-job training on the wage 
in the occupation, considering only the effect of accumulated human 
capital on the probability of promotion and the wages in succeeding 
occupations. Thus it is assumed that wages are constant while one 
works in the same occupation and wage growth occurs solely through 
occupational mobility. Wages, however, vary across individuals be¬ 
cause of differences in ability, education, and experience accumu¬ 
lated in previous occupations. 


A. The Model 

Consider an economy in which individuals wish to allocate their finite 
lifetime, T, between education and various feasible occupations so as 
to maximize their expected lifetime income, E(Y): 

$$E(Y) = \int_0^T e^{-rt} E(w_t)\,dt, \qquad (1)$$

where r is the rate of interest on borrowing and lending in the existing 
perfect capital market. The rate of interest is constant over time. 
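
Because the model later assumes wages are constant within an occupation, each segment of the integral in equation (1) has a closed form. The following is a minimal sketch, assuming illustrative values for the interest rate, the horizon, and the wages (none of these numbers come from the paper, and `pv_constant_wage` and `lifetime_income` are hypothetical helper names):

```python
import math


def pv_constant_wage(wage, start, end, r):
    """Present value at time 0 of a constant wage flow paid over [start, end]."""
    if r == 0:
        return wage * (end - start)
    return wage * (math.exp(-r * start) - math.exp(-r * end)) / r


def lifetime_income(spells, r):
    """Sum the discounted value of consecutive (wage, start, end) spells."""
    return sum(pv_constant_wage(w, s, e, r) for (w, s, e) in spells)


# Illustrative career: 4 years of schooling (no earnings), then two occupations.
r = 0.05
career = [(0.0, 0.0, 4.0), (1.0, 4.0, 6.0), (1.5, 6.0, 40.0)]
income = lifetime_income(career, r)
```

The schooling spell enters with a zero wage, which matches the assumption that the only cost of education is forgone earnings.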
1. Education

Education provides individuals with human capital, which subse¬ 
quently raises their future earnings through two channels, directly, 
via the potential returns to schooling in certain occupations, and indi¬ 
rectly, through the improvement in their career path. The costs of 
education are solely the forgone earnings. 

Individuals who attend the education system for a period of $t_s$ years
acquire a level of human capital $H_s$. 5 The term $H_s = H_s(t_s; a)$ is
an increasing function of individuals’ natural ability, $a$, as well as of
years of schooling.

2. Occupations 

The economy is characterized by a variety of occupations that differ 
in the required levels and types of human capital. There are n types of 
firms in the economy, $j = 1, 2, 3, \ldots, n$, several of each type. Each
type of firm offers a series of two occupations. 

Occupation 1 in a type $j$ firm may be joined at any point in time by
individuals whose ability level exceeds $a^j$. The ability requirement
increases with the firm type, that is, $a^{j+1} > a^j$. The wage rate in
occupation 1 in a type $j$ firm, $w_1^j$, is a function of the individual’s level
of education, $t_s$, as well as the individual’s ability: $w_1^j = w_1^j(t_s; a)$. Given
education and ability, the wage in occupation 1 in a type $j$ firm is
higher than that offered in occupation 1 in a type $j - 1$ firm.

Occupation 2 in a type $j$ firm can be obtained either through promotion
from occupation 1 in the same firm (i.e., intrafirm mobility) or
via mobility from occupation 2 in a type $j - 1$ firm (i.e., interfirm
mobility). A promotion decision is made after an individual has spent
a constant time interval, $\alpha > 0$, in occupation 1 in a particular firm of
type $j$. The decision is final in this particular firm. 6 However, individuals
may try once again in another firm of type $j$. Those individuals
will be considered for promotion after an additional period of length
$\beta \ge 0$.

The probability of promotion from occupation 1 to 2 in every firm
of type $j$, $P^j$, is positively related to the level of human capital obtained
in school, $H_s$, that obtained through experience in occupation 1 in
type $j$ firms, $H_1$, and ability, $a$. Thus $P^j = P^j(t_s, H_1; a)$. In turn, $H_1$ is
an increasing function of education, ability, and time spent in occupation
1 of type $j$ firms.

5 Clearly, the direct cost of education could be incorporated into the analysis, and an
alternative assumption in which it is not possible to generate education without the
completion of school (no partial credit) could have been employed. These alterations
will have no effect on the qualitative results.

6 This simplifying assumption captures the notion that the likelihood of promotion
declines beyond a certain point in time.

[Figure 1: wage structure. Vertical axis: wage rate.]

The wage rate in occupation 2 in a type $j$ firm, $w_2^j$, is an increasing
function of the individual’s education, $H_s$, human capital obtained in
occupation 2 in a type $j - 1$ firm, $H_2^{j-1}$, and ability. Individuals who
quit occupation 2 in a type $j - 1$ firm and join occupation 2 in a type $j$
firm are rewarded for job experience acquired in occupation 2 in a
type $j - 1$ firm. However, regardless of the acquired level of experience,
given ability and schooling, their wages are lower than those of
individuals who were promoted within the firm: $w_2^j = w_2^j(H_2^{j-1}, t_s; a)$,
where $H_2^{j-1}$ is an increasing function of time spent in occupation 2 in
firm $j - 1$, education, and ability.

Given $(t_s, a)$, irrespective of the level of $H_2^{j-1}$, the wage rate in a type
$j$ firm is higher in occupation 2 than in occupation 1, and the wage
in occupation 2 in a type $j$ firm is higher than that offered in occupation
2 in a type $j - 1$ firm. Moreover, to truncate the number of feasible
career paths further, it is assumed that for any $H_2^{j-1}$, given
$(t_s, a)$, $w_1^{j+1} < w_2^j < w_1^{j+2}$. The wage structure is depicted in figure 1
for a given value of the vector $(H_2^{j-1}, t_s, a)$.

3. Feasible Career Paths 

A career path is a series of occupations, characterized by the trans¬ 
ferability of skills and experience from one to another, that form a 
feasible working career. Consider individuals whose career paths are 
limited, for simplicity, to three jobs. (A job transition is defined as a 
change of occupation or a firm.) Individuals with ability level $a^j$ may
start their career in any firm of a type $h$, $h \le j$. Suppose that a firm of
entry is chosen optimally. Given $t_s$ years of schooling, individuals who
start their working lives in a firm of type $h$ earn the wage rate $w_1^h$ until
the promotion decision date. 7 Given the outcome of the promotion
decision, several feasible careers may be optimal.

1. If promotion is approved, individuals move to occupation 2 in
the firm, obtaining a wage rate $w_2^h$. They remain in this occupation
until the optimal quitting time to move to occupation 2 in firm $h + 1$.
If quitting is indeed optimal, they join this occupation and remain
there until the end of their working career, earning the wage rate
$w_2^{h+1}$.

2. If promotion is not approved, individuals may find either of the
following paths optimal: (a) remain in occupation 1 in the firm of
origin or (b) quit in favor of another firm of the same type, obtaining
the same wage, $w_1^h$, for an additional $\beta$ years until a promotion decision
will be made. 8 Then with probability $P^h$ (which is an increasing
function of the length of time spent in occupation 1, $\alpha + \beta$), promotion
will be approved, and they will move to occupation 2 and will be
paid the wage rate $w_2^h$. Otherwise, they will remain in occupation 1.

If promotion is approved, the optimal quitting time is a function of 
the costs associated with quitting as well as the contribution of the 
current job experience to the wage in occupation 2 in firm h + 1, 
$w_2^{h+1}$. If promotion is not approved, the decision whether and when to
quit (in order to try to obtain a promotion in a different firm of the 
same type) is determined by the costs associated with quitting as well 
as the probability of promotion in a type h firm. 9 

B. The Individuals’ Optimization 

Individuals wish to choose the level of schooling and a feasible career 
path so as to maximize the present value of their expected lifetime 
earnings. The optimal values are derived using the method of a back¬ 
ward solution. 

7 The current version of the paper excludes for the sake of brevity the possibility of
quitting prior to the promotion decision. If the wage in occupation 1 would have been
an increasing function of experience in other occupations, the possibility of quitting
prior to the promotion decision would have existed, adding to the richness of the
model. See Galor and Sicherman (1988) for a discussion of this possibility.

8 The necessary tenure in the new firm for a promotion, $\beta$, may be longer or shorter
than that in the original firm, $\alpha > 0$. Furthermore, $\beta$ may be equal to zero, in which case
promotion in the new firm may be considered without actual attendance.

9 If promotion is not approved, the possibility that quitting to a lower-level firm will
be beneficial is excluded under the assumption that the cost of quitting to a lower-level
firm is sufficiently large. Quitting to move to occupation 1 in a higher-level firm (when
promotion is not approved) is excluded as well. Mobility cost to a different type of firm
is assumed to be significantly larger than that to a different firm of the same type.
1. The Optimal Career Path


Consider individuals who spend their initial t s years in the education 
system and then join occupation 1 in a type j firm. 

Promotion is approved. Individuals who are promoted at the promotion
decision date $t_s + \alpha$, who find it beneficial to quit, wish to
choose the optimal quitting time from firm $j$, $\gamma_p^j$, so as to maximize the
value (at $t = 0$) of their future earnings, $V_p^j(\gamma_p^j; t_s, a)$.

Let $C_p$ be the present value (evaluated at time 0) of the costs associated
with quitting occupation 2 in firm $j$ to move to occupation 2 in
firm $j + 1$; $C_p$ is independent of the quitting time. The individual’s
maximization problem is therefore


$$\max_{\gamma_p^j} V_p^j(\gamma_p^j; t_s, a) = \int_{t_s+\alpha}^{\gamma_p^j} e^{-rt} w_2^j(t_s, a)\,dt + \int_{\gamma_p^j}^{T} e^{-rt} w_2^{j+1}(\gamma_p^j; t_s, a)\,dt - C_p \qquad (2)$$

subject to $t_s + \alpha \le \gamma_p^j < T$. The term $V_p^j(\gamma_p^j; t_s, a)$ is assumed to be twice
continuously differentiable and strictly concave in $\gamma_p^j$. Thus for an
internal optimum (that is, $t_s + \alpha < (\gamma_p^j)^* < T$) a necessary and
sufficient condition for the maximization problem is





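$$e^{-r(\gamma_p^j)^*}\left[(w_2^{j+1})^* - w_2^j(t_s, a)\right] = \int_{(\gamma_p^j)^*}^{T} e^{-rt}\,\frac{\partial (w_2^{j+1})^*}{\partial \gamma_p^j}\,dt \qquad (3)$$

where $(w_2^{j+1})^* = w_2^{j+1}[(\gamma_p^j)^*, t_s, a]$.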

The optimal quitting time is characterized, therefore, by the equalization
of the present value of the direct loss of income resulting from
the delay in joining occupation 2 in a type $j + 1$ firm (i.e., the present
value of the difference between the wages in occupation 2 in type $j$
and $j + 1$ firms) and the present value of the additional stream of
income in occupation 2 in a type $j + 1$ firm because of the lengthening
of the experience in occupation 2 in a type $j$ firm.
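
The trade-off that pins down the quitting time can be checked numerically. The sketch below is illustrative only: the functional form for the wage in the type j + 1 firm, the parameter values, and all function names are assumptions, not taken from the paper. It maximizes the promoted worker's objective over a grid of quitting times and then evaluates the first-order condition (marginal loss from delay minus marginal gain from added experience) at the grid maximum.

```python
import math

R, T = 0.05, 40.0       # interest rate and end of working life (assumed)
T_S, ALPHA = 4.0, 2.0   # years of schooling and time to promotion decision (assumed)
W2_J = 1.0              # wage in occupation 2 of the type-j firm (assumed)
C_P = 0.1               # present value of the quitting cost (assumed)


def w2_next(gamma):
    """Assumed wage in occupation 2 of a type j+1 firm, rising with experience."""
    x = gamma - (T_S + ALPHA)  # time spent in occupation 2 of firm j
    return 1.1 + 0.5 * (1.0 - math.exp(-0.3 * x))


def dw2_next(gamma):
    """Derivative of w2_next with respect to the quitting time."""
    x = gamma - (T_S + ALPHA)
    return 0.15 * math.exp(-0.3 * x)


def annuity(start, end):
    """Integral of exp(-R t) over [start, end]."""
    return (math.exp(-R * start) - math.exp(-R * end)) / R


def value_if_quit_at(gamma):
    """Discounted earnings from the promotion date on, for a given quitting time."""
    stay = W2_J * annuity(T_S + ALPHA, gamma)
    move = w2_next(gamma) * annuity(gamma, T)
    return stay + move - C_P


def foc_residual(gamma):
    """Marginal loss from delaying the move minus marginal gain from experience."""
    loss = math.exp(-R * gamma) * (w2_next(gamma) - W2_J)
    gain = dw2_next(gamma) * annuity(gamma, T)
    return loss - gain


# Grid search over feasible quitting times in [T_S + ALPHA, T).
step = 0.001
n_points = int(round((T - (T_S + ALPHA)) / step))
grid = [T_S + ALPHA + i * step for i in range(n_points)]
gamma_star = max(grid, key=value_if_quit_at)
value_never_quit = W2_J * annuity(T_S + ALPHA, T)
```

With these assumed numbers the maximum is interior and the residual of the first-order condition is close to zero there. Comparing `value_if_quit_at(gamma_star)` with `value_never_quit` then decides whether quitting takes place at all.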

If it is assumed that individuals who are indifferent between quit¬ 
ting and not quitting remain in the firm, quitting will take place if and 
only if 


$$V_p^j((\gamma_p^j)^*; t_s, a) > \int_{t_s+\alpha}^{T} e^{-rt} w_2^j(t_s, a)\,dt. \qquad (4)$$

Promotion is denied. Individuals who are not promoted at the promotion
decision date, $t_s + \alpha$, may either quit to go to another firm of
the same type, where an additional promotion attempt will be made
after a period of length $\beta$, or remain in occupation 1 in the same firm.
The present value (evaluated at time $t = 0$) of the future earnings
of individuals who do not quit the firm, $V_{nq}^j(t_s, a)$, is

$$V_{nq}^j(t_s, a) = \int_{t_s+\alpha}^{T} e^{-rt} w_1^j(t_s, a)\,dt, \qquad (5)$$

whereas the present value of the future earnings of individuals who
quit the firm, $V_q^j(t_s, a)$, is

$$V_q^j(t_s, a) = \int_{t_s+\alpha}^{T} e^{-rt} w_1^j(t_s, a)\,dt + P^j(t_s, \alpha + \beta, a) \times \int_{t_s+\alpha+\beta}^{T} e^{-rt}\left[w_2^j(t_s, a) - w_1^j(t_s, a)\right]dt - C_{np}, \qquad (6)$$

where $C_{np}$ is the present value (evaluated at time 0) of the cost associated
with quitting occupation 1 in a type $j$ firm to go to another firm of
type $j$. Thus if promotion is not approved, an individual will quit the
firm if and only if $V_q^j(t_s, a) > V_{nq}^j(t_s, a)$.

Corollary 1. The higher is the probability of promotion, the 
greater is the possibility of quitting if promotion is not approved. 

The corollary follows immediately from equations (5) and (6). 10
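
A short numerical sketch may make the corollary concrete. It assumes that the gain from quitting after a denial equals the promotion probability times the discounted wage premium of occupation 2 from the second decision date onward, minus the quitting cost; this is how the difference between the two present values is read here. Parameter values and function names are hypothetical.

```python
import math


def annuity(start, end, r):
    """Integral of exp(-r t) over [start, end]."""
    return (math.exp(-r * start) - math.exp(-r * end)) / r


def gain_from_quitting(p, w1, w2, t_s, alpha, beta, horizon, r, cost):
    """Value of quitting minus value of staying after a denied promotion."""
    return p * (w2 - w1) * annuity(t_s + alpha + beta, horizon, r) - cost


def threshold_probability(w1, w2, t_s, alpha, beta, horizon, r, cost):
    """Promotion probability above which quitting is strictly better."""
    return cost / ((w2 - w1) * annuity(t_s + alpha + beta, horizon, r))


# Assumed illustrative parameters.
params = dict(w1=1.0, w2=1.5, t_s=4.0, alpha=2.0, beta=1.0,
              horizon=40.0, r=0.05, cost=2.0)
p_bar = threshold_probability(**params)
quits_when_low = gain_from_quitting(p=0.2, **params) > 0
quits_when_high = gain_from_quitting(p=0.6, **params) > 0
```

The gain is linear and increasing in the promotion probability, so workers whose probability exceeds `p_bar` quit after a denial and those below it stay.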


2. Optimal Schooling 

Individuals determine the optimal level of schooling, $t_s$, so as to
maximize the present value of their expected lifetime earnings,
$M(t_s; (\gamma_p^j)^*, a)$:

$$M(t_s; (\gamma_p^j)^*, a) = \int_{t_s}^{t_s+\alpha} e^{-rt} w_1^j\,dt + \left\{P^j(t_s; \alpha, a)\,V_p^j[t_s; (\gamma_p^j)^*, a] + [1 - P^j(t_s; \alpha, a)]\,V_{np}^j(t_s; a)\right\}, \qquad (7)$$

where $V_{np}^j(t_s; a) = \max[V_{nq}^j(t_s; a), V_q^j(t_s; a)]$ is the present value of the
future earnings of an individual who was not promoted.

The term $M(t_s; (\gamma_p^j)^*, a)$, which is twice continuously differentiable
in $t_s$, is assumed to be strictly concave in $t_s$. Consider an internal solution
for the optimal schooling level. At the optimum, a marginal increase in
schooling time equalizes the gains and losses


10 Corollary 1 is based on the assumption that a denial of promotion is firm specific
(e.g., a bad match or limited vacancies), thus not affecting the likelihood of promotion
in other firms. If such a denial indicates a lower likelihood of promotion in other firms,
the likelihood of quitting will also be lower. This corollary is discussed and tested
empirically in Sec. III.



CAREER MOBILITY 



in the present values of expected future earnings. The gains are due
to the improvement in the probability of promotion and the increase 
in the wage rate in occupation 2 in a type j + 1 firm, whereas the 
losses are due to the delay in the beginning of the working career in 
the various occupations. 
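
The schooling trade-off can be sketched with a simple grid search. The version below is deliberately stripped down and rests on assumptions that are not in the paper: wages are linear in schooling, the promotion probability is a concave function of schooling, and all quitting options are ignored, so a promoted worker stays in occupation 2 and a denied worker stays in occupation 1. Names and numbers are hypothetical.

```python
import math

R, HORIZON, ALPHA = 0.05, 45.0, 2.0  # interest rate, lifetime, promotion lag (assumed)


def annuity(start, end):
    """Integral of exp(-R t) over [start, end]."""
    return (math.exp(-R * start) - math.exp(-R * end)) / R


def wage_occ1(t_s):
    """Assumed wage in occupation 1, increasing in schooling."""
    return 1.0 + 0.05 * t_s


def wage_occ2(t_s):
    """Assumed wage in occupation 2, a fixed premium over occupation 1."""
    return wage_occ1(t_s) + 0.5


def promotion_probability(t_s):
    """Assumed promotion probability, increasing and concave in schooling."""
    return 1.0 - math.exp(-0.2 * t_s)


def expected_lifetime_earnings(t_s):
    """Discounted expected earnings for a given schooling length (no quitting)."""
    p = promotion_probability(t_s)
    before = wage_occ1(t_s) * annuity(t_s, t_s + ALPHA)
    expected_wage_after = p * wage_occ2(t_s) + (1.0 - p) * wage_occ1(t_s)
    after = expected_wage_after * annuity(t_s + ALPHA, HORIZON)
    return before + after


# Search schooling lengths from 0 to 20 years in steps of 0.01.
grid = [i * 0.01 for i in range(0, 2001)]
t_s_star = max(grid, key=expected_lifetime_earnings)
```

Under these assumptions the optimum is interior: more schooling raises both the wage and the promotion probability, but it delays the start of every later earnings stream.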


3. The Optimal Level of Entry 

Consider the discussion in Section IIA3. Individuals who are characterized
by an ability level $a^j$ may start their working careers in occupation
1 of any firm of type $h$ as long as $h \le j$. As was postulated earlier,
given the education level, the wage rate in occupation 1 is higher in a 
higher-level firm. Nevertheless, individuals may consider a lower- 
level firm in which the direct return to schooling is lower if in those 
firms, for a given level of schooling, the probability of promotion is 
higher and, subsequently, so is the probability of obtaining higher 
future wages. 

An initial entry at a firm of level $j - 3$ or lower is nonbeneficial.
The highest possible wage rate that may result from this type of
career is lower than the initial (and thus certain) wage rate obtained in
occupation 1 in firm $j$. Initial entry at a firm of level $j$, $j - 1$, or $j - 2$,
however, may clearly be beneficial. The optimal level of entry, therefore,
is the one under which the value function $M^{h*}$, $h = j$, $j - 1$, or
$j - 2$, as defined by (7), attains its maximum over $h$.

Corollary 2. Individuals may choose an entry level in which the 
direct returns to schooling are lower than those in other feasible entry 
levels if the effect of schooling on the probability of promotion is
higher in this entry level. 
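
The entry-level trade-off described in this subsection can be illustrated by comparing the three candidate entry levels directly. The sketch assumes hypothetical wages that respect the ordering imposed on the wage structure, hypothetical promotion probabilities that are higher in lower-level firms, and no quitting after the promotion decision. Firm types are indexed by their offset from $j$.

```python
import math

R, HORIZON, T_S, ALPHA = 0.05, 40.0, 4.0, 2.0  # all assumed


def annuity(start, end):
    """Integral of exp(-R t) over [start, end]."""
    return (math.exp(-R * start) - math.exp(-R * end)) / R


# Offsets 0, -1, -2 stand for firm types j, j - 1, j - 2 (assumed values).
WAGE_OCC1 = {0: 1.4, -1: 1.2, -2: 1.0}
WAGE_OCC2 = {0: 1.7, -1: 1.5, -2: 1.3}
PROMOTION_PROB = {0: 0.1, -1: 0.9, -2: 0.95}


def entry_value(offset):
    """Discounted expected earnings from entering occupation 1 at this firm type."""
    w1, w2, p = WAGE_OCC1[offset], WAGE_OCC2[offset], PROMOTION_PROB[offset]
    before = w1 * annuity(T_S, T_S + ALPHA)
    after = (p * w2 + (1.0 - p) * w1) * annuity(T_S + ALPHA, HORIZON)
    return before + after


values = {offset: entry_value(offset) for offset in WAGE_OCC1}
best_offset = max(values, key=values.get)
```

With these numbers the worker enters one level below the highest feasible firm: the starting wage is lower there, but the much higher chance of promotion more than compensates.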


C. Empirical Implications 

The theory of career mobility suggests several specific predictions 
concerning the effects of schooling on wages and firm mobility. While 
in some occupations the returns to schooling are in the form of higher 
wages, in other occupations the returns are in terms of higher proba¬ 
bilities of advancing to occupations with higher wages. This hypothe¬ 
sis can explain the observed differences in returns to schooling across 
occupations. The model suggests that if the returns to schooling are 
lower while one works in a specific occupation, the effect of schooling 
on the probability of being promoted from this occupation (within or 
across firms) will be higher. Similarly, it will be rational for some 
individuals to spend a portion of their working careers in occupations 
that require a lower level of schooling than they have acquired. This 



JOURNAL OF POLITICAL ECONOMY 

observation can serve as a partial explanation for the phenomenon of 
“overeducation” and is discussed in Sicherman (1987). 

The theoretical model provides an ambiguous prediction concern¬ 
ing the unconditional effect of schooling on career mobility. On the 
one hand, highly educated individuals are able to start their working 
careers in a higher-level occupation (higher step on the ladder). Their 
careers, therefore, might involve fewer occupations. On the other 
hand, highly educated individuals face greater opportunities (longer 
ladders). The model suggests, therefore, that given an occupation of 
origin, more educated individuals are more likely to move to a higher- 
level occupation. 

At any point in time, individuals face different probabilities of pro¬ 
motion within the firm, based on personal characteristics and occupa¬ 
tion. The model predicts that among individuals who were not pro¬ 
moted, those with a higher probability of promotion are more likely 
to quit the firm. 

Specific human capital and job-matching theories predict a negative 
effect of tenure on mobility. The presented theory of career mobility, 
conversely, predicts that there exists a positive effect of tenure on 
occupational mobility; individuals acquire skills and experience in one 
occupation in order to be able to move to another occupation. One 
way to test for the positive duration effect on career mobility is to look 
only at intrafirm occupation mobility since this type of mobility entails 
no loss of firm-specific investment. 

Unobserved heterogeneity of workers is likely to play an important
role in occupational choice and occupational mobility decisions. Such
heterogeneity might give rise to occupational mobility because of a
matching process discussed earlier. Nevertheless, if a worker samples
an occupation and as a result of a bad match makes a transition, there
is no reason to assume that he or she will move to a higher-ranked
occupation. Bamberger (1986), for instance, shows that when the
matching takes place while in school, it may be the case that an optimal
sampling will involve a move from a higher- to a lower-level
occupation. Therefore, the upward transitions analyzed in this paper
are more likely to be the result of career mobility, although it might be
that some of those transitions are due to occupational matching. Another
implication of unobserved heterogeneity is that firms might use
promotions as a screening process. Workers who are not promoted
are those whose specific quality is revealed not to be high enough.
Nevertheless, the results reported in this paper with regard to
interfirm mobility cannot be fully explained as a screening process.

Occupational matching and screening theories are certainly possible
reasons for occupational mobility. Some of the results reported in
this paper could very well support some implications of these theories.
As a whole, we find that the empirical findings support the
theory of career mobility and its implications. We do not attempt to
provide a discriminating test between these theories.

In the next section these empirical predictions are tested. An 
econometric model of career mobility is constructed and estimated 
using a large panel data set. 

III. An Econometric Model of Career Mobility 

A. The Data and Definitions 

A sample of male heads of households, aged 18-60, observed annually
over the period 1976-81 was drawn from the Panel Study of
Income Dynamics (PSID) individuals tape.11 Individuals reported
their occupation at the time of the survey or, if unemployed, the last
occupation held. Occupational change is defined to occur when the
two-digit occupational category reported by the worker in two successive
surveys is different. Because of measurement errors, the measured
rate of transitions is expected to be higher than the real rate.12

The implicit assumption is that, with those categories, an occupational
change will be observed when there is an apparent change in
the tasks performed by the worker. Since each category is a combination
of a number of detailed occupational titles, it is possible that some
individuals move between relatively different occupations in the same
category with no change observed, while others move between relatively
similar occupations that fall into different categories and a
change will be observed. We assume that, on average, workers who
move across categories experience a bigger change in tasks than those
who move across occupations within a category (see App. A for a list
of the 25 categories).

Occupational mobility that is due to career mobility is considered as 
mobility to a higher-level occupation. We use this criterion in order to 


11 The data include a “poverty subsample,” but the qualitative results are not affected
by its exclusion. A data appendix is available from the authors on request.

12 The extreme assumption that the reported occupation is pure noise was strongly
rejected by comparing the observed transitions per individual with those produced by a
binomial process. An indication of the amount of measurement error can be obtained
by looking at the number of cases in which individuals report a transition to an occupation
held 2 years earlier (although each error in reporting occupation will cause two
spurious transitions, only one will be captured in the career mobility models). Fifteen
percent of the transitions in the PSID are of such a nature. Nevertheless, it should be
clear that such transitions are not necessarily erroneous. While it is expected that part
of the upward occupational mobility will take place within the firm (around half of the
transitions are such), it is unlikely that one would observe an occupational change
without a change in position. On the basis of recoded tenure in position, half of the
occupational transitions took place without a change in position. We believe that this
contradiction is mainly due to reporting errors in tenure in position.
distinguish career mobility from other types of occupational mobility,
although it is possible that mobility to a lower-level occupation (based 
on our ranking) will be part of the worker’s career mobility (see App. 
B for details). 
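
As an illustration of these definitions, the following Python sketch labels a worker's year-to-year transitions as upward, downward, or none. It assumes the two-digit codes are ordered as they are listed in Appendix A (highest level first); the function name `classify_transitions` and the sample work history are hypothetical and are not part of the original analysis.

```python
# Two-digit PSID codes in the order listed in Appendix A
# (assumed here to run from the highest to the lowest level of human capital).
RANKING = [10, 18, 11, 14, 15, 13, 17, 12, 20, 19, 16, 45, 31,
           50, 80, 52, 55, 40, 51, 41, 61, 62, 75, 70, 71]
LEVEL = {code: position for position, code in enumerate(RANKING)}  # 0 = highest level


def classify_transitions(codes_by_year):
    """Label each pair of successive surveys as 'none', 'upward', or 'downward'."""
    years = sorted(codes_by_year)
    labels = {}
    for earlier, later in zip(years, years[1:]):
        old, new = codes_by_year[earlier], codes_by_year[later]
        if new == old:
            labels[later] = "none"        # same two-digit category: no change recorded
        elif LEVEL[new] < LEVEL[old]:
            labels[later] = "upward"      # move to a higher-ranked occupation
        else:
            labels[later] = "downward"
    return labels


# Hypothetical worker observed annually over 1976-81.
history = {1976: 41, 1977: 41, 1978: 20, 1979: 20, 1980: 41, 1981: 41}
labels = classify_transitions(history)
```

Only the transitions labeled "upward" would count as career mobility under the criterion in the text; a return to an occupation held 2 years earlier, as in 1980 here, is the pattern n. 12 uses to gauge measurement error.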

The vertical distance between occupations is measured as the difference
in the mean levels of human capital needed to work in the
occupations, after required training is completed. These levels are
constructed by summing the weighted means of the levels of schooling,
(a proxy for) market experience prior to entry to the occupation,
and the amounts of training required in order for a worker to be
qualified to work in the different occupations. The weights are the
estimated coefficients of these variables in a wage regression. For a
formal derivation and discussion, see Appendix B.


B. The Model 


In this subsection, an econometric model of career mobility is presented
and the effects of different characteristics on the probability of
mobility are estimated. The distinction between inter- and intrafirm
mobility made in the paper enhances the understanding of the interaction
between firm and occupational mobility as elements of career
development.

At each period an individual experiences one of the following three
alternatives: a move to a higher-level occupation across firms (j = 1),
a promotion to a higher-level occupation within the firm (j = 2), or
neither (j = 0).

Transition j occurs when the latent variable $Y^{*}_{imtj} > 0$, where

$$Y^{*}_{imtj} = X_{it}\alpha_j + \gamma_j ED_i + \delta_m + \varepsilon_{imtj}, \tag{8}$$

where i is the individual index, m is the occupational index, t is time
(the initial period), j is the alternative, $X_{it}$ is a vector of individual
characteristics that may vary across time, and $ED_i$ is the level of schooling.
Parameter $\delta_m$ is an occupation fixed effect. It is assumed to be
constant across time and across individuals.

Assuming that $\varepsilon$ is logistically distributed gives rise to a multinomial
logit model in which the underlying probabilities are

$$P_j = \frac{\exp(Z\beta_j)}{\sum_{k=0}^{2}\exp(Z\beta_k)}, \qquad j = 0, 1, 2. \tag{9}$$

In order to identify the parameters, the normalization $\beta_0 = 0$ is
imposed and the estimated parameters are obtained by maximum
likelihood. Table 1 reports the estimation results for upward transitions
within and across firms.
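
To make equation (9) concrete, the following Python sketch evaluates the three probabilities for a single observation under the normalization that the coefficients of alternative 0 are zero. The regressors and coefficient values are invented for illustration and are not the estimates in table 1; the function name is likewise hypothetical.

```python
import math


def multinomial_logit_probs(z, betas):
    """Return [P_0, P_1, P_2] for one observation.

    z     -- list of regressors for the observation (illustrative values)
    betas -- dict mapping alternative j to its coefficient list; the entry
             for j = 0 is all zeros, which is the identifying normalization
    """
    scores = [sum(b * x for b, x in zip(betas[j], z)) for j in sorted(betas)]
    top = max(scores)  # subtract the maximum before exponentiating, for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


z = [1.0, 12.0, 5.0]  # constant, years of schooling, years of tenure (made up)
betas = {
    0: [0.0, 0.0, 0.0],      # neither transition: beta_0 = 0
    1: [-2.0, 0.05, -0.04],  # upward move across firms (made-up coefficients)
    2: [-2.5, 0.06, 0.03],   # promotion within the firm (made-up coefficients)
}
probs = multinomial_logit_probs(z, betas)
```

Because only differences between the linear indices matter, fixing one alternative's coefficients at zero loses no generality; the remaining coefficients are read relative to the "neither" outcome.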
If one aggregates the two types of upward transition into one category
(“career mobility”), the model collapses to a standard logit
model. The estimation results under this specification are also reported
in table 1 (cols. 9-12).

As discussed earlier, unobserved heterogeneity may be a determinant
of mobility. To the extent that such unobservables are correlated
with observables (e.g., schooling and tenure), our estimates are likely
to be biased.13

Another estimation problem might arise because of misreporting
of occupational changes (see n. 12). However, it can be shown that if
the errors in reported occupations are random, the estimated coefficients
will be biased toward zero, thus weakening the reported results.
Focusing on upward transitions will only reduce the number of errors
without causing any additional bias.


1. Schooling and Career Mobility

The theory of career mobility predicts two opposite effects of schooling
on career mobility. Since more educated workers can start their
working careers in a higher-level occupation, their careers might involve
a smaller number of distinct occupations than those of less
educated workers. In addition, high-skill careers might involve fewer
changes in tasks over time, which will cause more educated workers to
have fewer transitions. On the other hand, as predicted by the model,
given the occupation of origin, more educated workers are more
likely to move to a higher-level occupation (within or across firms).

Without a control for occupation of origin (table 1, col. 11), schooling
has a negative effect on career mobility. This result indicates that
careers of more educated workers are more likely to be composed of a
smaller number of distinct occupations. If we control for one-digit
occupation of origin, schooling has a positive effect on career mobility
within and across firms.14 Given firm separation, more educated
workers are more likely to quit than to be laid off, but schooling
increases the likelihood of upward mobility in the case of both quits
and layoffs (not shown here).

The schooling effect on the probability of career mobility will vary, 
depending on the type of career and the occupation the worker is in. 


13 In the maximum likelihood estimates, in the presence of individuals’ fixed effects,
even with no correlation between the fixed effects and other covariates, the estimation
results of these covariates are likely to be inconsistent.

14 A similar observation is made with regard to black workers. On average, they are
more likely to move to a higher-level occupation. If we control for occupation of origin,
the race dummy becomes negative. See Galor and Sicherman (1988) for a discussion on
race and other variables in the career mobility models.
In the next subsection, we analyze the differences in the returns to
schooling across occupations. 

2. The Effect of Schooling on Wage and on the 
Probability of Promotion 

As suggested by the theoretical analysis, at some stages of a working 
career we might observe that workers with different levels of human 
capital have the same wages within a specific occupation. In other 
words, the estimated short-run returns to schooling, when workers 
are observed while at that stage, will be relatively low. 

Human capital theory is a life cycle theory, and returns to schooling
should be estimated accordingly. Therefore, we suggest that the observed
differences in returns to schooling across occupations may
be due to the differences in promotion probabilities across
occupations.

In the following, we test the hypothesis that if the return to human
capital (schooling) is lower while one is working in a specific occupation, the
effect of schooling on the probability of being promoted from that occupation
will be higher. Consider the following fixed-effect models:

$$Y^{*}_{imt} = X_{it}\beta_1 + \gamma_m ED_i + \delta_m + \varepsilon_{imt}, \tag{10}$$

$$\ln(W_{imt}) = X_{it}\beta_2 + \alpha_m ED_i + \delta'_m + \varepsilon'_{imt}. \tag{10'}$$

Equation (10) is a career mobility equation in which the schooling
effect ($\gamma_m$) is occupation specific. Equation (10') is a standard wage
regression. As in equation (10), the schooling effect ($\alpha_m$) is occupation
specific.

The following equation is implied by our hypothesis and will be
tested empirically:

$$\operatorname{corr}(\alpha_m, \gamma_m) < 0. \tag{11}$$

Estimates of $\alpha_m$ and $\gamma_m$ are presented in table 2.

The estimated correlation between the effect of schooling on wage
in the occupation and its effect on the probability of moving to a
higher-level occupation is -.61 and is highly significant. (The probability
that the correlation is different from zero is .9985.)

TABLE 2

Schooling Effect on Career Mobility and Wage: Interaction between
Schooling and Occupational Dummies in the Career Mobility (Logit)
and Wage Regressions

                                               Career Mobility Model*
Occupational Category                          Coefficient (1)   Probability (2)   Wage Model (3)

10 Physicians, dentists†                       ...               ...               .0922 (7.88)
11 Other medical and paramedical               .05784 (.94)      .0073             .0594 (2.28)
12 Accountants and auditors                    -.06144 (.82)     -.0078            .0780 (3.44)
13 Teachers, primary and secondary schools     .02647 (.45)      .0033             -.0028 (.14)
14 Teachers (college), social scientists,      -.06757 (.84)     -.0086            .0686 (2.58)
   librarians, and archivists
15 Architects, chemists, engineers, and        -.14642 (1.73)    -.0186            .0755 (7.90)
   physical and biological scientists
16 Technicians                                 .1175 (1.84)      .0149             .0501 (6.33)
17 Public advisors                             .05762 (.93)      .0073             .0605 (5.21)
18 Judges, lawyers                             -.33584 (.98)     -.0426            .3487 (3.24)
19 Professional, technical, and kindred        .1564 (2.69)      .0198             .0237 (1.20)
   workers not listed above
20 Managers, officials, and proprietors        .3885 (5.15)      .0493             .0739 (19.6)
   (except farm), not self-employed
31 Like 20, self-employed (unincorporated      .2153 (3.26)      .0273             .0681 (6.77)
   businesses)
40 Secretaries, stenographers, and typists     .1138 (2.19)      .0144             -.0627 (1.40)
41 Other clerical workers                      .1426 (3.48)      .0181             .0308 (5.09)
45 Sales workers                               .07513 (1.98)     .0095             .1064 (12.5)
50 Foremen not elsewhere classified            .2164 (6.08)      .0274             -.0372 (4.29)
51 Other craftsmen and kindred workers         .1953 (5.85)      .0248             .0371 (12.7)
52 Government protective service workers       .1176 (2.71)      .0149             .0429 (3.10)
   (fire, police, marshals, and constables)
55 Members of the armed forces                 .06732 (.43)      .0085             .0830 (6.06)
61 Transport equipment operatives              .05677 (2.32)     .0072             .0336 (7.21)
62 Operatives, except transport                .1198 (5.09)      .0152             .0437 (13.3)
70 Unskilled laborers (nonfarm)                .1101 (5.18)      .0140             .0382 (9.30)
71 Farm laborers and foremen                   .08899 (3.12)     .0113             .0446 (4.90)
75 Other service workers                       .04436 (2.17)     .0056             .0311 (6.12)
80 Farmers (owners and tenants) and managers   .06254 (.30)      .0079             .0666 (2.79)

Note: Absolute t-statistics are in parentheses.

* The logit parameter estimates are in col. 1, and the derivatives for the probabilities are reported in col. 2,
calculated as $\beta p(1 - p)$.

† Observations in which the highest-level occupation is observed are excluded. The other independent variables
are market experience, firm tenure, union membership, race, SMSA, if married, if disabled, and occupation of
origin.

Since each of the coefficients is measured with a different level of
error, it can be shown that the measured correlation given above is
underestimated. This result is based on the assumption that the estimation
errors are independent. Since the two sets of returns are derived
from the same sample, this assumption might not hold. In order to
ensure such independence, we divided the data into two random
subsamples and reestimated the regressions using a different subsample
for each regression. The estimated correlation between the two
sets of returns this time was -.53 and was again significantly different
from zero. The reduction in the correlation is the result of avoiding
the positive correlation between the regressions’ estimated errors and
the increase in the standard errors of the estimated coefficients due to
the smaller number of observations.
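
The split-sample step can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function name `split_sample`, the fixed seed, and the ten observation ids are hypothetical, and the text does not say how the random division was actually carried out.

```python
import random


def split_sample(observation_ids, seed=0):
    """Randomly partition observation ids into two disjoint halves."""
    rng = random.Random(seed)  # fixed seed so that the split is reproducible
    shuffled = list(observation_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# One half would feed the career mobility regression, the other the wage regression.
mobility_sample, wage_sample = split_sample(range(1, 11))
```

Because no observation enters both regressions, the estimation errors of the two sets of schooling coefficients are no longer driven by the same draws, which is the independence the argument requires.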

3. Quitting and Career Mobility 

Economic theory suggests that a worker will quit his job if the expected
present value of his future earnings if he stays in the firm is
lower than if he leaves the firm. Most of the work that we are aware of
relates quitting decisions to changes in the economy or imperfect
information. Quitting as a result of a bad match and finding out the
existence of a better job in another firm are examples of imperfect
information (or the arrival of new information) concerning opportunities
and the nature of the firm. Changes in the economy might
make the worker reevaluate his position and cause him to quit.

The theory of career mobility presented here suggests an additional
reason for mobility (in the spirit of Rosen [1972]): Quitting is a
device by which workers realize an optimal path of a chosen career.
When a career that a worker considers his best choice cannot be
realized in one firm (and the loss of firm-specific human capital is
taken into account), quitting will be part of the worker’s optimal (ex
ante) career path. What is unique to this type of quitting is that it may
be planned in advance by the worker.

Some stages of the career are uncertain. We presented this uncertainty
as the probability of being promoted inside the organization.
The actual quitting time, conditioned on the promotion decision, will
differ from the initial expected quitting time. The theoretical result
that this section tests empirically is the effect of the promotion decision
on the worker’s decision to quit. Our hypothesis is that the higher
the expected probability of promotion a worker has, the larger the
effect of not being promoted on the decision to quit.

There are many reasons why workers who have high expected 
probabilities of promotion are not promoted. One reason is that there 
are not enough vacancies for the higher position (or the worker has 
reached the highest position available in the firm). Another reason 
might be unobserved heterogeneity or a bad match. If the reason for 
the bad match is firm specific, the worker might decide to quit. 

On the other hand, a realization that the reason for not being 
promoted will also hold in other firms might induce the worker to stay 
(if he is not laid off before) in the same occupation/position within the 
firm. 

We call those workers who have a high expected probability of
being promoted but are not promoted the “disappointed workers.”
The higher the level of expected probability of promotion, given no
promotion, the higher the level of disappointment.

We predict that the higher the level of disappointment, the more 
likely the worker is to quit. We next test this hypothesis empirically. 

For each worker at each period, we estimate the probability of
promotion, based on the promotion model estimated earlier. We then
see whether the worker is promoted or not. For those workers who
are not promoted (the disappointed), we “look into the future” and
see if and when they quit. We continue to follow those workers as long
as they stay in the firm and are not promoted. The hypothesis tested is
that the higher the level of disappointment (defined earlier), the
higher the likelihood of observing an early quit. It should be noted
that the structure of the data set (one observation each year) does not
allow us to observe those workers who quit very early. The reason is
that we estimate the probability of promotion for the year interval
and define “no promotion” only if the worker stayed in the firm until
the next survey. Therefore, the workers who expect, with high probability
in the beginning of the period, to be promoted and are not
might quit during the period, and for those workers one cannot say
whether they were not promoted.

We now present a nonparametric measure to test our model. Let
$Y_{it} = 1$ if individual i is promoted between t - 1 and t, and 0 otherwise;
$P_{it} = \operatorname{prob}(Y_{it} = 1)$ be the expected probability that the worker
will be promoted, based on observed characteristics at t - 1 (using the
estimated coefficients reported in table 1, col. 1); $Q_{i,t+j} = 1$ if i quit
between t + j - 1 and t + j, and 0 if not; and $D_{it} = (P_{it} \mid Y_{it} = 0)$ be
defined as the level of disappointment.

The hypothesis is that the rank correlation between $D_{it}$ and $Q_{i,t+j}$ is
positive and will decrease as j increases. In other words,

$$\operatorname{corr}(D_{it}, Q_{i,t+1}) > \operatorname{corr}(D_{it}, Q_{i,t+2}) > \operatorname{corr}(D_{it}, Q_{i,t+3}) > \operatorname{corr}(D_{it}, Q_{i,t+4}).$$

The number of surveys in the data set allows us to observe a maximum of
four periods to the future. The estimation results using Spearman
correlation are reported in the following table.
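
The construction of the disappointment measure and its rank correlation with later quits can be sketched in Python as follows. The records are invented for illustration, the helper names are hypothetical, and tied values are given the average of their ranks, which is an assumption here since the text does not state how ties were handled.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        tied_rank = (start + end) / 2 + 1
        for k in order[start:end + 1]:
            ranks[k] = tied_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Product-moment correlation of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical records: (predicted promotion probability, promoted?, quit next period?)
records = [
    (0.30, 0, 1), (0.05, 0, 0), (0.22, 1, 0), (0.18, 0, 1),
    (0.02, 0, 0), (0.10, 0, 0), (0.25, 0, 1), (0.08, 0, 0),
]
# Disappointment is the predicted probability, kept only for workers not promoted.
not_promoted = [(p, q) for p, promoted, q in records if promoted == 0]
disappointment = [p for p, _ in not_promoted]
quit_next = [q for _, q in not_promoted]
rho = spearman(disappointment, quit_next)
```

Since the quit indicator is binary, nearly all of its values are tied, so the treatment of ties matters more here than in a typical rank correlation.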


                       Spearman Correlation Coefficients

                           Q_{i,t+1}   Q_{i,t+2}   Q_{i,t+3}   Q_{i,t+4}

D_{it}                       .0424       .0094      -.0093      -.0241
                            (.000)      (.614)      (.686)      (.508)
Number of observations       6,501       4,774       1,900         758


The numbers in parentheses are the probabilities that the true correlation
is zero. These probability values are obtained by treating
$(n - 2)^{1/2}\rho/(1 - \rho^{2})^{1/2}$ as coming from a t-distribution with n - 2 degrees
of freedom, where $\rho$ is the appropriate correlation. In our case,
these values should be taken with considerable caution.
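
As a rough check on how such probability values arise, the following Python sketch computes the test statistic for a given correlation and sample size. It substitutes the standard normal tail for the t-distribution tail, which is an assumption of this sketch that is reasonable only because the samples here run to hundreds or thousands of observations; the function name is hypothetical.

```python
import math


def correlation_p_value(rho, n):
    """Return (t statistic, two-sided p-value) for the hypothesis of zero correlation.

    The statistic is sqrt(n - 2) * rho / sqrt(1 - rho**2). The p-value uses the
    normal approximation to the t-distribution with n - 2 degrees of freedom.
    """
    t_stat = math.sqrt(n - 2) * rho / math.sqrt(1 - rho ** 2)
    p_value = math.erfc(abs(t_stat) / math.sqrt(2))  # equals 2 * (1 - Phi(|t|))
    return t_stat, p_value


# Correlation and sample size taken from the third column of the table above.
t_stat, p_value = correlation_p_value(-0.0093, 1900)
```

With a binary quit indicator the ranks are heavily tied, which is one reason to read the resulting probability values cautiously.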

These results support our hypothesis. The fact that the correlation 
becomes negative beyond a certain point might be due to the presence 
of unobserved heterogeneity that is not firm specific, as discussed 
earlier. 

4. Duration Effects on Career Mobility 

Specific human capital and job-matching theories predict a negative
effect of tenure on mobility.15 The presented theory of career mobility
conversely predicts that there exists a positive effect of tenure in
occupation on occupational mobility; individuals acquire skills and
experience in one occupation in order to be able to move to another
occupation. An empirical test for the presence of a positive duration
effect on career mobility has to be conducted that controls for firm-specific
investment. Such a test can be performed by considering intrafirm
mobility.

Since the PSID does not report tenure in occupation, the analysis is
limited to the effects of time in the labor force (experience) and time
with the present employer (tenure). Replacing firm tenure with tenure
in position did not alter the results reported here.

The results reported in table 1 indicate that the effect of tenure on
career mobility depends on whether or not the worker changes firms.
In column 9, which combines intra- and interfirm mobility, the effect
of tenure on mobility is negative. However, column 1 shows a positive
effect of tenure on the promotion probability. This observation supports
the hypothesis that skills and experience accumulated in prior
occupations increase the likelihood of moving to a higher-level occupation.


IV. Summary 

This paper analyzes the role and significance of occupational
mobility in the labor market, focusing on individuals’
careers. The study provides an additional dimension to the
existing analysis of prominent labor market phenomena including
investment in human capital, differences in wage profiles across individuals,
and interfirm mobility. Occupational mobility, defined as a
change in tasks performed on the job, is analyzed as an integral part
of the worker’s career path.

It is shown that more educated individuals have careers that involve 


15 Jovanovic’s (1979) matching model predicts an initial positive duration effect. 
a smaller number of distinct occupations and therefore are less likely
to change occupations (and firms). Within a given occupation, however,
more educated individuals are more likely to move to a higher-level
occupation, within or across firms. This observation explains the
variations in the returns to schooling across occupations. In those
occupations in which the returns to schooling (in terms of wages) are
lower, the effect of schooling on career mobility is larger.

The rate of career mobility decreases with time in the labor market. 
With higher levels of experience, career mobility is more likely to 
occur within the firm (promotion) than across firms. Within the firm, 
firm tenure has a positive effect on career mobility. This observation 
confirms the proposition that skills and experience accumulated in 
one job/occupation are transferable to other occupations along the 
worker’s career. 

As was demonstrated in the theory presented in this paper, individuals’
optimal career path may involve intrafirm mobility as well as
interfirm mobility. Intrafirm career mobility (promotion) is subject to
the employer’s decision, whereas interfirm mobility and its optimal
timing are determined by the individuals who choose the optimal
quitting time so as to maximize their expected lifetime earnings. Intrafirm
career mobility is uncertain. The probability of promotion is a
function of schooling, ability, and job experience. The optimal investment
in human capital and the optimal quitting time maximize the
individual’s expected lifetime income. The optimal quitting time for
individuals who were not promoted occurs earlier than that for individuals
who were promoted. It is shown empirically that among workers
who were not promoted, those with a higher probability of promotion
are more likely to quit the firm. The higher the probability of a
promotion, the earlier they quit.


Appendix A 

Occupational Classification Used in the PSID 

Two-Digit Classification16

10 Physicians (medical and osteopathic), dentists
18 Judges, lawyers
11 Other medical and paramedical
14 Teachers (college), social scientists, librarians, and archivists
15 Architects, chemists, engineers, and physical and biological scientists
13 Teachers, primary and secondary schools
17 Public advisors
12 Accountants and auditors
20 Managers, officials, and proprietors (except farm), not self-employed
19 Professional, technical, and kindred workers not listed above
16 Technicians
45 Sales workers
31 Like 20, self-employed (unincorporated businesses)
50 Foremen not elsewhere classified
80 Farmers (owners and tenants) and managers
52 Government protective service workers (fire, police, marshals, and constables)
55 Members of the armed forces
40 Secretaries, stenographers, and typists
51 Other craftsmen and kindred workers
41 Other clerical workers
61 Transport equipment operatives
62 Operatives, except transport
75 Other service workers
70 Unskilled laborers (nonfarm)
71 Farm laborers and foremen

16 These are ranked by the level of human capital required to work in the occupation
(see App. B). The occupational codes are those used in the PSID. Occupation 73,
private household workers, was not ranked because of zero observations.


One-Digit Classification (Not Ranked) 

10-19 Professional, technical, and kindred workers 
20 Managers, officials, or proprietors 
30-31 Self-employed businessmen 
40-49 Clerical and sales workers 
50-52 Craftsmen, foremen, and kindred workers 
61-62 Operatives and kindred workers 
70-75 Laborers and service workers 
80 Farmers and farm managers 


Appendix B 

The Vertical Ranking of Occupations 

The objective of the following analysis is to construct an occupational index 
that will serve as an indicator for the amount of human capital needed to 
work in different occupations. Upward occupational mobility based on such 
an index will reflect an increase in the level of human capital, obtained 
through schooling, market experience, or other forms of on-the-job training. 

The index is derived by first regressing log earnings on a set of variables
including, among other things, education, (a proxy for) labor market experience
prior to entry into the occupation, and the amount of training required
to perform the job. An index of occupational level is computed for each
occupation as a weighted average of the occupational means of these three
variables, where the weights are the coefficients from the earnings function.

A formal derivation of the ranking is now presented followed by the estimation
results. A comparison with other alternative occupational indices is
presented followed by a discussion of some of the advantages and weaknesses
of our index.

Consider the following wage regression:

$$\ln(W_{ijt}) = X_{it}\beta + \alpha E_i + \tau\,PEXP_{ijt} + \theta\,TEN_{ijt} + \rho\,RQT_{ijt} + \varepsilon_{ijt}, \tag{B1}$$

where X is a vector of observed characteristics, E is the worker’s level of
schooling, PEXP is market experience prior to entry into the present occupation,
TEN is tenure in the occupation, RQT is the amount of training the
worker received in order to be fully qualified to work in the present occupation,
i is the individual’s index, j is the occupation index, and t is the time
index.

Define the level of human capital the worker needs in order to be qualified
to work in the occupation as

$$HC_{ij} = \alpha E_i + \tau\,PEXP_{ijt} + \rho\,RQT_{ijt}. \tag{B2}$$

Then the mean level of human capital needed to be fully qualified to work in
occupation j is given by

$$\overline{HC}_j = \frac{\sum_{i} HC_{ij}}{N_j}, \tag{B3}$$

and the vertical distance between occupations k and l is given by

$$DV_{kl} = \overline{HC}_k - \overline{HC}_l. \tag{B4}$$

Since tenure in occupation is not reported in the PSID, it was replaced by
two alternative proxies, tenure in position and time in the labor force, which
can be viewed as lower and upper bounds to tenure in occupation. Appendix
A shows the resulting ranking. Estimates of equation (B1) and human capital
measures under different specifications are available on request.
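
The steps in equations (B2)-(B4) can be illustrated with a short Python sketch. The weights and worker records below are invented, since the estimated coefficients of equation (B1) are not reported here, and the names `alpha`, `tau`, and `rho` for the weights on schooling, prior experience, and required training are labels assumed for this sketch.

```python
def human_capital(schooling, prior_experience, required_training, alpha, tau, rho):
    """Equation (B2): weighted sum of schooling, prior experience, and training."""
    return alpha * schooling + tau * prior_experience + rho * required_training


def occupation_means(workers, alpha, tau, rho):
    """Equation (B3): mean human capital by occupation code."""
    totals, counts = {}, {}
    for occupation, e, pexp, rqt in workers:
        hc = human_capital(e, pexp, rqt, alpha, tau, rho)
        totals[occupation] = totals.get(occupation, 0.0) + hc
        counts[occupation] = counts.get(occupation, 0) + 1
    return {occ: totals[occ] / counts[occ] for occ in totals}


# Made-up weights standing in for the wage-regression coefficients.
alpha, tau, rho = 0.08, 0.02, 0.05
# Made-up workers: (occupation code, schooling, prior experience, required training), in years.
workers = [
    (20, 14, 10, 3.0), (20, 16, 8, 2.0),
    (41, 12, 4, 0.5), (41, 12, 6, 1.0),
]
means = occupation_means(workers, alpha, tau, rho)
distance = means[20] - means[41]  # equation (B4): vertical distance between 20 and 41
```

Tenure in the occupation enters the wage regression but not the index, so the index measures what a worker needs to qualify for the occupation rather than what is accumulated after entering it.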


Occupational Status and Prestige 

Different scales of “occupational status” or “occupational prestige” have been
developed by sociologists over the last 60 years. Different methods were used
in constructing them, varying from surveys in which individuals were asked to
rank occupations’ prestige on the basis of personal judgment to more analytical
methods that combined prestige with different measures of education
and wages. Two examples of such indices are the Duncan socioeconomic
status index and the National Opinion Research Center occupational prestige
index developed by Paul Siegel and Robert Hodge. Both indices are highly
correlated with the vertical ranking developed in this work (see App. table
B1).


Discussion 

The major motivation in deriving the occupational index presented in this
Appendix is that occupational upgrading is considered as mobility that is
obtained through gaining skills and experience. Therefore, mobility to an
occupation that pays higher wages as a compensation for bad working conditions,
risk, and so forth or for other reasons such as unionism should not be
considered as upward mobility.

Although in deriving the index we implicitly assume that wages are the only
form of compensation, the resulting index is superior to other alternatives
such as the mean wages per occupation. By basing our ranking on the mean
levels of human capital in the occupations, we allow some occupations to be
ranked higher than what is reflected by reported wages, assuming implicitly
other forms of compensation (see, e.g., farmers).

Our ranking is highly correlated with that obtained by the mean levels of
schooling (the rank correlation is .95). Nevertheless, it allows occupations
reached mainly through experience and on-the-job training to be ranked
higher than those based on schooling alone (see, e.g., managers and
foremen).

A major weakness of the ranking is that each occupation is an aggregation 
of more detailed occupations. Therefore, a worker might move to a higher- 
level occupation considering the detailed occupations but, on the basis of our 
ranking, will be observed moving to a lower-level occupation. Data limitations 
do not allow a more detailed (three-digit) occupational ranking. 

Our ranking might underrank occupations that are obtained through 
means of investment that are unobserved (by the econometrician) such as 
dedication and initiative. It might overrank occupations characterized by low- 
quality workers (relative to their observed characteristics). 

The major objective for deriving an occupational index is to define an 
occupational ranking. Considering the analysis for which the index is used, 
we find it superior to alternative rankings used in the literature. 
Tax Changes and Phase Diagrams for
an Overlapping Generations Model 


John Laitner 

University of Michigan 


The literature evaluating tax changes within an intertemporal general 
equilibrium framework subdivides into representative agent 
and overlapping generations formulations. Papers in the former 
class have developed techniques analogous to those routine for static 
analyses. I show that the same general approach works for the overlapping 
generations model. The context is the system from Auerbach 
and Kotlikoff's Dynamic Fiscal Policy. The proposed methodology 
allows one to examine stability and determinacy issues for the 
model, it deals precisely with small policy changes, and it can easily 
handle bundles of changes. I present comparative static results for 
comparison with existing work. 


A number of recent papers have studied potential tax changes in the 
context of intertemporal general equilibrium models. One approach 
summarizes the household sector with a representative agent. Several 
prominent examples are Judd (1985, 1987) (see also Hall 1971; Brock 
and Turnovsky 1981; Abel and Blanchard 1983; Becker 1985; Chamley 
1985). Consider for a moment the simplest model in Judd (1985). 
It has a fixed labor supply, one type of capital, and one consumption 
good. With concave production and utility functions, the model’s 
phase diagram in capital-consumption space displays a saddle about a 
unique stationary point. History dictates a starting capital stock. As 
the analysis begins, the unique solution time path jumps to the consumption 
level on the saddle’s stable arm over the given initial capital 

I owe thanks to Alan Auerbach, Ted Bergstrom, Larry Kotlikoff, David Levine, and 
Robert Lucas for helpful comments on drafts of this paper. I began this project as a 
guest at Universität des Saarlandes. 
stock. It then converges over time to the stationary state. To evaluate 
a surprise change in a tax parameter, we redraw the phase diagram 
and jump to the new stable arm. Judd characterizes his policy results 
with derivatives. 

A second approach uses an overlapping generations formulation. 
Individual families maximize private utilities over finite life spans; 
their separate optimization efforts plus market-clearing conditions 
determine the economy’s overall growth. There is, at minimum, 
heterogeneity among households based on date of birth. Auerbach 
and Kotlikoff (1987) make an extensive study of taxes within such a 
context, using computer simulations to evaluate a variety of policy 
initiatives. 

The purpose of the present paper is to reexamine the Auerbach 
and Kotlikoff model using a methodology that, resembling Judd’s, is 
based on marginal analysis. Several significant benefits follow. 

One benefit is that one can accurately evaluate the effects of small 
policy changes, possibly the most likely type to receive serious consideration 
in practice. My procedure produces sequences of derivatives 
quantifying the displacement of equilibria over time after, for 
example, tax rate changes. The derivatives are strictly comparable to 
conventional comparative static results. 

A second benefit concerns the model’s phase diagram. As in Judd, 
there is a stationary solution. Since the analysis must keep track of the 
asset holdings of individual families, the dimension of the state space 
is large: 108 for Auerbach and Kotlikoff’s model. To study an economy 
that remains in the vicinity of a stationary solution, we still want a 
saddle point. The stable arm must have sufficient dimensionality to 
allow us to reach the stationary state from historically given initial 
conditions; too much dimensionality, on the other hand, will imply 
indeterminacy. Unfortunately, nothing seems to guarantee a phase 
diagram with a particular nature; see, for instance, Calvo (1978), 
Laitner (1982), Kehoe and Levine (1985, 1988), and Woodford 
(1984) (and, for that matter, Judd [1987]). I shall take the attitude of 
Samuelson’s “correspondence principle”: in any given case of interest, 
one should check the phase diagram carefully. My methodology fully 
exposes the latter in the local vicinity of any given stationary solution. 
This segment of the analysis should attract the attention even of those 
exclusively interested in Auerbach and Kotlikoff’s specific results: 
their analysis both requires and assumes determinacy and stability. 

A third advantage of my approach is its ability to deal with very 
complex policy adjustments. Tax bills often cover many rates simultaneously 
and make provision for phase-in intervals. My methodology 
allows changes to be examined separately and then combined linearly. 
Studying long phase-in intervals is straightforward. From a computational 
standpoint, many policy evaluations can follow from a single 
saddle point characterization. 

This paper is organized as follows. Section I restates the basic 
model from Auerbach and Kotlikoff. Section II algebraically and 
graphically describes my approach. Section III reviews the type of 
saddle point phase diagram sought. Section IV presents numerical 
characterizations of the phase diagrams for a number of the examples 
in Auerbach and Kotlikoff. Section V derives a way of generating 
sequences of comparative static results, both for permanent and for 
temporary tax and debt parameter changes. It also provides numerical 
illustrations for comparisons with existing work. Section VI concludes 
the paper. 


I. The Framework 


I begin with the Auerbach and Kotlikoff model, although the general 
procedure is independent of the model’s specific functional forms 
and elements. 

Individual families live 55 years (say, adult ages 20-75), receive no 
inheritance, and leave no bequest. Each has the same utility function: 
for a family born at time t, lifetime utility is 


$$U_t = \frac{1}{1 - (1/\gamma)} \cdots \tag{1}$$

where $c_{t,s}$ and $l_{t,s}$ are consumption and leisure at age $s$, $\delta \in (0, 1)$ is the 
subjective discount rate, $\rho > 0$ is the elasticity of substitution between 
consumption and leisure, and $\gamma > 0$ is the elasticity of substitution for 
utility between consecutive ages. Leisure, $l_{t,s}$, is normalized to fall between 
zero and one (retirement). The current price of the single 
consumption good is always one. Let $a_{t,s}$ be family asset holdings at the 
start of age $s$; $R_t$ the interest rate during period $t$; $W_t$ the wage rate; $\tau^r_t$, 
$\tau^w_t$, and $\tau^c_t$ proportional tax rates for interest income, wages, and consumption; 
and $e_s$ “effective labor input” at age $s$. Then the family 
budget constraint corresponding to (1) is 

$$a_{t,s+1} = a_{t,s} \cdot [1 + R_{t-1+s} \cdot (1 - \tau^r_{t-1+s})] + (1 - l_{t,s}) \cdot e_s \cdot W_{t-1+s} \cdot (1 - \tau^w_{t-1+s}) - \frac{c_{t,s}}{1 - \tau^c_{t-1+s}}, \quad \text{all } s = 1, \ldots, 55, \tag{2}$$

$$a_{t,56} = 0 = a_{t,1}. \tag{3}$$

We must be careful to impose the constraint 

$$l_{t,s} \le 1 \quad \text{all } s. \tag{4}$$

There are 

$$(1 + n)^t \tag{5}$$

families born at time $t$. Since this generates time trends for variables 
such as gross national product and the aggregate capital stock, the 
nomenclature below separates trend and a per capita measure of level 
for most variables. 
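
The bookkeeping in (2) and (3) can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: prices and tax rates are held constant, the leisure profile is fixed by hand, and consumption is forced to be flat over the life cycle rather than chosen optimally, so the helper names `asset_path` and `flat_consumption` and all numerical values are hypothetical.

```python
def asset_path(c, leisure, e, R, W, tau_r, tau_w, tau_c):
    """Iterate budget constraint (2) from a_{t,1} = 0 with constant prices and taxes.

    Returns [a_{t,1}, ..., a_{t,S+1}] for S = len(c) ages.
    """
    assets = [0.0]
    for s in range(len(c)):
        interest = 1.0 + R * (1.0 - tau_r)
        earnings = (1.0 - leisure[s]) * e[s] * W * (1.0 - tau_w)
        spending = c[s] / (1.0 - tau_c)
        assets.append(assets[-1] * interest + earnings - spending)
    return assets


def flat_consumption(leisure, e, R, W, tau_r, tau_w, tau_c):
    """Constant consumption level that meets the terminal condition in (3).

    Terminal assets are affine in the consumption level, so two trial
    paths (c = 0 and c = 1) are enough to locate the root.
    """
    ages = len(e)
    end0 = asset_path([0.0] * ages, leisure, e, R, W, tau_r, tau_w, tau_c)[-1]
    end1 = asset_path([1.0] * ages, leisure, e, R, W, tau_r, tau_w, tau_c)[-1]
    return end0 / (end0 - end1)


AGES = 55
leisure = [0.6] * 45 + [1.0] * 10   # assumed profile: retired (l = 1) in the last 10 years
e = [1.0] * AGES                    # assumed flat "effective labor input"
prices = dict(R=0.05, W=1.0, tau_r=0.15, tau_w=0.15, tau_c=0.0)

c_bar = flat_consumption(leisure, e, **prices)
path = asset_path([c_bar] * AGES, leisure, e, **prices)
```

Here `path[0]` and `path[-1]` correspond to $a_{t,1}$ and $a_{t,56}$; in the model itself the consumption and leisure profiles would instead come from the household's first-order conditions.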

The economy is closed. The physical capital stock employed in 
production at time $t$, $K_t \cdot (1 + n)^t$, must have been built and financed 
at time $t - 1$ (or before). Hence, if the government debt at the start of 
period $t$ is $D_t \cdot (1 + n)^t$, total productive physical capital during period 
$t$ is 

$$K_t \cdot (1 + n)^t = \left[\sum_{s=1}^{55} (1 + n)^{t+1-s} \cdot a_{t+1-s,s}\right] - [D_t \cdot (1 + n)^t]. \tag{6}$$

A second aggregation formula gives total (“effective”) labor, $L_t \cdot (1 + n)^t$, 
supplied during period $t$: 

$$L_t \cdot (1 + n)^t = \sum_{s=1}^{55} (1 + n)^{t+1-s} \cdot (1 - l_{t+1-s,s}) \cdot e_s. \tag{7}$$

If $Y_t \cdot (1 + n)^t$ is real GNP, we have a constant elasticity of substitution 
aggregate production function 

$$Y_t \cdot (1 + n)^t = A \cdot \left\{\epsilon \cdot [K_t \cdot (1 + n)^t]^{1-(1/\sigma)} + (1 - \epsilon) \cdot [L_t \cdot (1 + n)^t]^{1-(1/\sigma)}\right\}^{1/[1-(1/\sigma)]}, \tag{8}$$

where $A$ is a scaling constant, $\epsilon$ gives capital’s factor-share parameter, 
and $\sigma$ is the production elasticity of substitution. There is no physical 
depreciation. The term $Y_t \cdot (1 + n)^t$ is homogeneously divisible into 
consumption, investment, and government goods. Competitive behavior 
prevails in the business sector so that $R_t$ and $W_t$ equal the 
marginal products of the right-hand side of (8) with respect to $K_t \cdot 
(1 + n)^t$ and $L_t \cdot (1 + n)^t$. 
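
As a rough illustration of how factor prices follow from (8), the Python sketch below evaluates a CES function and its analytic marginal products. The function names and the parameter values (`A`, `eps`, `sigma`) are placeholders chosen for the example, not the calibration used in the paper, and the formulas assume $\sigma \ne 1$.

```python
def ces_output(K, L, A=1.0, eps=0.25, sigma=0.8):
    """CES aggregate output with the functional form of (8); values are placeholders."""
    rho = 1.0 - 1.0 / sigma
    return A * (eps * K ** rho + (1.0 - eps) * L ** rho) ** (1.0 / rho)


def factor_prices(K, L, A=1.0, eps=0.25, sigma=0.8):
    """Analytic marginal products: R = dY/dK and W = dY/dL."""
    rho = 1.0 - 1.0 / sigma
    Y = ces_output(K, L, A, eps, sigma)
    R = eps * A ** rho * (Y / K) ** (1.0 / sigma)
    W = (1.0 - eps) * A ** rho * (Y / L) ** (1.0 / sigma)
    return R, W


K0, L0 = 3.0, 1.0
Y0 = ces_output(K0, L0)
R0, W0 = factor_prices(K0, L0)
```

Because the function has constant returns to scale, factor payments exhaust output: `R0 * K0 + W0 * L0` equals `Y0` up to rounding.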

If $G_t \cdot (1 + n)^t$ is government spending and $C_t \cdot (1 + n)^t$ is aggregate 
private consumption, the government’s budget constraint is 

$$\begin{aligned}
G_t \cdot (1 + n)^t + D_t \cdot (1 + n)^t \cdot R_t ={}& K_t \cdot (1 + n)^t \cdot R_t \cdot \tau^r_t \\
&+ D_t \cdot (1 + n)^t \cdot R_t \cdot \tau^r_t \\
&+ L_t \cdot (1 + n)^t \cdot W_t \cdot \tau^w_t \\
&+ C_t \cdot (1 + n)^t \cdot \frac{\tau^c_t}{1 - \tau^c_t} \\
&+ D_{t+1} \cdot (1 + n)^{t+1} - D_t \cdot (1 + n)^t.
\end{aligned} \tag{9}$$

In the analysis below, $G_t = G$, all $t$. In other words, one tax rate or 
debt change always offsets another. In particular, as in Auerbach and 
Kotlikoff, we start with $D_0 = 0 = \tau^c_0$, and any variation in $\tau^r_t$ or $\tau^c_t$, $t \ge 
0$, or $D_t \cdot (1 + n)^t$, $t \ge 1$, is balanced with joint, equal percentage 
adjustments in $\tau^w_t$ and $\tau^r_t$ at all subsequent dates that preserve (9) and 
$G_t \cdot (1 + n)^t = G \cdot (1 + n)^t$. 

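Household utility maximization yields first-order conditions 

$$\frac{\partial U_{t+1-s}/\partial l_{t+1-s,s}}{\partial U_{t+1-s}/\partial c_{t+1-s,s}} = e_s \cdot W_t \cdot (1 - \tau^w_t) \cdot (1 - \tau^c_t) \quad \text{or} \quad l_{t+1-s,s} = 1, \quad \text{each } s = 1, \ldots, 55, \tag{10}$$

$$(1 - \tau^c_t) \cdot \frac{\partial U_{t+1-s}}{\partial c_{t+1-s,s}} = [1 + R_{t+1} \cdot (1 - \tau^r_{t+1})] \cdot (1 - \tau^c_{t+1}) \cdot \frac{\partial U_{t+1-s}}{\partial c_{t+1-s,s+1}}, \quad \text{all } s = 1, \ldots, 54. \tag{11}$$
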


To find a stationary solution, fix constant levels for $G$, tax rates, and 
$D_0$. Choose a capital to “effective” labor ratio, say $\bar{k}$. The marginal 
conditions associated with the production function then yield a wage 
and an interest rate. The latter are constant for all time in the stationary 
state. Using them, solve the consumer maximization problem. 
Household optimal asset and leisure profiles together with (6) and (7) 
imply a time-invariant ratio $k = K_t/L_t$. Search for a fixed point to the 
mapping defined in this way, that is, a value of $\bar{k}$ yielding $k = \bar{k}$. Auerbach 
and Kotlikoff provide tables of stationary solutions corresponding to 
various parameter values, as well as discussions of their parameter 
choices. 

In a stationary state, $K_t$, $L_t$, and $Y_t$ are constant; $R_t$ and $W_t$ are 
constant; and $a_{t+1-s,s}$, $l_{t+1-s,s}$, and $c_{t+1-s,s}$ are independent of $t$. For 
future reference, let an asterisk signify a stationary solution so that $k^*$ 
is the stationary capital to labor ratio, $R^*$ and $W^*$ stationary-state 
factor prices, $a^*_{t+1-s,s}$ stationary-state asset holdings per family of age 
$s$, and so forth. 
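
The fixed-point search just described can be sketched on a much smaller economy. The Python fragment below is an illustration only: it replaces the 55-period household problem with a hypothetical two-period setup (a fixed saving rate `s` out of wages and Cobb-Douglas technology), so `implied_k`, `find_stationary_k`, and every parameter value are assumptions rather than part of the Auerbach and Kotlikoff model.

```python
def wage(k, A=1.0, eps=0.3):
    """Marginal product of labor for y = A * k**eps (output per unit of labor)."""
    return (1.0 - eps) * A * k ** eps


def implied_k(k_guess, s=0.4, n=0.01):
    """Capital-labor ratio implied by household saving when factor prices
    are those generated by k_guess (toy stand-in for aggregating with (6)-(7))."""
    return s * wage(k_guess) / (1.0 + n)


def find_stationary_k(lo=1e-6, hi=10.0, tol=1e-12):
    """Bisection on implied_k(k) - k; assumes exactly one sign change on [lo, hi]."""
    f_lo = implied_k(lo) - lo
    for _ in range(200):
        if hi - lo < tol:
            break
        mid = 0.5 * (lo + hi)
        f_mid = implied_k(mid) - mid
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


k_star = find_stationary_k()
```

In the full model each evaluation of the mapping requires solving the life cycle problem at the trial factor prices, which is why the search is the expensive step.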


II. Perfect Foresight Dynamics 

Let us now move beyond the analysis of stationary states, assuming 
perfect foresight. The approach in this section is related to that used 
for the simpler model in Laitner (1984). 1 


1 Laitner (1987), again in a simpler context, indicates a way in which this type of 
analysis can be extended to cover continuous-time models. See also Cass and Yaari 
(1967). 
To illustrate the methodology, consider for a moment a very simple 
model of economic growth, 

$$k_{t+1} = h(k_t, \theta), \tag{12}$$

with $\theta$ an exogenous parameter and $k_t$ an endogenous variable. In this 
example, think of the latter as being the aggregate capital stock, with 
events prior to $t$ having fixed its time $t$ level. A stationary solution is a 
value $k^* = k^*(\theta)$ such that $k^* = h(k^*, \theta)$. To characterize the effect on 
$k^*$ of a permanent change in $\theta$, differentiate and collect terms: 

$$\frac{dk^*}{d\theta} = \frac{\partial h(k^*, \theta)/\partial\theta}{1 - [\partial h(k^*, \theta)/\partial k]}. \tag{13}$$

Turning to adjustment paths, suppose we start at time 0 with the 
economy in a stationary state (parameter values having been constant 
for many periods) but at that moment we make a surprise, 
permanent change in $\theta$. Differentiating (12) gives a first-order difference 
equation for $dk_t/d\theta$: 

$$\frac{dk_{t+1}}{d\theta} = \frac{\partial h(k_t, \theta)}{\partial k} \cdot \frac{dk_t}{d\theta} + \frac{\partial h(k_t, \theta)}{\partial\theta}. \tag{14}$$

Because we have started with a stationary state $k^*$, (12) implies $k_t = k^*$, 
all $t \ge 0$. Thus (14) simplifies to an equation with constant coefficients: 

$$\frac{dk_{t+1}}{d\theta} = \frac{\partial h(k^*, \theta)}{\partial k} \cdot \frac{dk_t}{d\theta} + \frac{\partial h(k^*, \theta)}{\partial\theta}. \tag{15}$$

The nature of the variable $k_t$ yields an initial condition $dk_0/d\theta = 0$. 

Figure 1 shows how to interpret the derivatives that (15) and the 
initial condition generate. Consider any time $t > 0$. The original value 
of the parameter $\theta$ was, say, $\theta_a$. Had we maintained it, $k_t$ would have 
equaled $k^*$, as shown. If at time 0 we permanently alter $\theta$ to $\theta_b$, solution 
of the original model would yield a value for $k_t$ of, say, $k^b$. Similarly, 
a time 0 change to $\theta_c$ would have yielded $k_t = k^c$, and so on for $\theta_d$. 
The curve in figure 1 connects all these values for $k_t$. As we find $dk_t/d\theta$ 
along the stationary-state path using (15), we are finding the slope of 
the curve at $\theta = \theta_a$. Despite the linearity of (15), there are no approximations 
involved: equation (15) yields the exact slope of the curve at 
$(\theta_a, k^*)$ and the precise analogue of the stationary-state comparative 
static result of (13). 
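
The claim that (15) delivers an exact slope, not an approximation, can be checked numerically. The Python sketch below uses a hypothetical law of motion $h(k, \theta) = \theta k^{\alpha}$ (chosen only because its stationary state has a closed form) and compares the derivative sequence from (15) with a finite-difference slope of the nonlinear path; `ALPHA`, `theta0`, and the function names are assumptions made for the example.

```python
ALPHA = 0.3  # hypothetical curvature parameter


def h(k, theta):
    """Toy law of motion k_{t+1} = h(k_t, theta) = theta * k_t**ALPHA."""
    return theta * k ** ALPHA


def k_star(theta):
    """Stationary solution of k = theta * k**ALPHA."""
    return theta ** (1.0 / (1.0 - ALPHA))


def derivative_path(theta, periods):
    """Iterate (15): d_{t+1} = h_k * d_t + h_theta with d_0 = 0,
    both partial derivatives evaluated at the stationary state."""
    ks = k_star(theta)
    h_k = ALPHA * theta * ks ** (ALPHA - 1.0)
    h_theta = ks ** ALPHA
    d = [0.0]
    for _ in range(periods):
        d.append(h_k * d[-1] + h_theta)
    return d


def k_path(theta_new, k0, periods):
    """Nonlinear path after a surprise permanent change to theta_new at time 0."""
    path = [k0]
    for _ in range(periods):
        path.append(h(path[-1], theta_new))
    return path


theta0, step, t = 1.5, 1e-6, 5
exact_slope = derivative_path(theta0, t)[t]
up = k_path(theta0 + step, k_star(theta0), t)[t]
down = k_path(theta0 - step, k_star(theta0), t)[t]
numerical_slope = (up - down) / (2.0 * step)
long_run_slope = derivative_path(theta0, 400)[-1]
```

As the horizon grows, the sequence from (15) settles at the stationary-state comparative static value given by (13), which is what `long_run_slope` computes for this toy case.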

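Returning to our overlapping generations model, suppose that we 
have a stationary state, labeled with asterisked variables as in Section 
I. Let our analysis begin at time 0. 

Totally differentiating each family’s budget constraint, (2), and using 
the fact that we are computing derivatives along a stationary-state 
growth path, we get 

$$\begin{aligned}
da_{t+1-s,s+1} ={}& [1 + R^* \cdot (1 - \tau^{r*})] \cdot da_{t+1-s,s} + a^*_{t+1-s,s} \cdot [(1 - \tau^{r*}) \cdot dR_t - R^* \cdot d\tau^r_t] \\
&- e_s \cdot W^* \cdot (1 - \tau^{w*}) \cdot dl_{t+1-s,s} + (1 - l^*_{t+1-s,s}) \cdot e_s \cdot (1 - \tau^{w*}) \cdot dW_t \\
&- (1 - l^*_{t+1-s,s}) \cdot e_s \cdot W^* \cdot d\tau^w_t - \frac{dc_{t+1-s,s}}{1 - \tau^{c*}} - \frac{c^*_{t+1-s,s}}{(1 - \tau^{c*})^2} \cdot d\tau^c_t, \quad \text{all } s = 1, \ldots, 55;\ t \ge 0.
\end{aligned} \tag{16}$$

Fig. 1.—k at time t for different values of θ 

Total differentiation of (11) along a stationary-state path yields 

$$\begin{aligned}
&-\frac{\partial U_{t+1-s}}{\partial c_{t+1-s,s}} \cdot d\tau^c_t + (1 - \tau^{c*}) \cdot \left(\frac{\partial^2 U_{t+1-s}}{\partial c_{t+1-s,s}\,\partial l_{t+1-s,s}} \cdot dl_{t+1-s,s} + \frac{\partial^2 U_{t+1-s}}{\partial c_{t+1-s,s}^2} \cdot dc_{t+1-s,s}\right) \\
&\quad = -[1 + R^* \cdot (1 - \tau^{r*})] \cdot \frac{\partial U_{t+1-s}}{\partial c_{t+1-s,s+1}} \cdot d\tau^c_{t+1} \\
&\qquad + (1 - \tau^{c*}) \cdot [1 + R^* \cdot (1 - \tau^{r*})] \cdot \left(\frac{\partial^2 U_{t+1-s}}{\partial c_{t+1-s,s+1}\,\partial l_{t+1-s,s+1}} \cdot dl_{t+1-s,s+1} + \frac{\partial^2 U_{t+1-s}}{\partial c_{t+1-s,s+1}^2} \cdot dc_{t+1-s,s+1}\right) \\
&\qquad + (1 - \tau^{c*}) \cdot \frac{\partial U_{t+1-s}}{\partial c_{t+1-s,s+1}} \cdot [(1 - \tau^{r*}) \cdot dR_{t+1} - R^* \cdot d\tau^r_{t+1}], \quad \text{all } s = 1, \ldots, 54.
\end{aligned} \tag{17}$$

Notice that the additive separability of preferences makes the derivatives 
of $U_{t+1-s}$ with respect to $c_{t+1-s,s}$ and $l_{t+1-s,s}$ functions of $c^*_{t+1-s,s}$ 
and $l^*_{t+1-s,s}$ alone, and the derivatives with respect to $c_{t+1-s,s+1}$ and 
$l_{t+1-s,s+1}$ functions of $c^*_{t+1-s,s+1}$ and $l^*_{t+1-s,s+1}$ alone. 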

Define 

$$X_t \equiv (a_{t-1,2}, \ldots, a_{t-54,55}, c_{t,1}, \ldots, c_{t-54,55})', \qquad x_t \equiv (a_{t-1,2}, \ldots, a_{t-54,55}, c_{t,1}, \ldots, c_{t-53,54})',$$

$$Z_t \equiv (X_t', l_{t,1}, \ldots, l_{t-54,55})', \qquad S_t \equiv (\tau^w_t, \tau^r_t, \tau^c_t, D_t, \ldots)'.$$

Recall that $a_{t,1} = 0$, all $t$. Differentiating the marginal product conditions 
for factor prices and the aggregation formulas (6) and (7), we 
can characterize $dW_t$ and $dR_t$ in terms of $dZ_t$ and $dS_t$, any $t \ge 0$. So for 
any such $t$, (16) and (17), with the case $s = 55$ omitted from (16) for 
the moment, yield a block of 54 + 54 constant coefficient linear 
equations of the form 

$$H^1_1 \cdot dZ_{t+1} + H^1_2 \cdot dS_{t+1} + H^1_3 \cdot dZ_t + H^1_4 \cdot dS_t = 0 \in R^{108}. \tag{18}$$

Totally differentiating (10), we have 55 more (constant-coefficient) 
linear equations in $dZ_t$ and $dS_t$, any $t \ge 0$. 2 Using the latter block for $t$ 
and the corresponding block for $t + 1$, we can eliminate the 55 
$dl_{t+1-s,s}$ variables of $dZ_t$ and the 55 $dl_{t+2-s,s}$ variables of $dZ_{t+1}$ in (18). 
Equation system (18) then becomes 

$$H^2_1 \cdot dX_{t+1} + H^2_2 \cdot dS_{t+1} + H^2_3 \cdot dX_t + H^2_4 \cdot dS_t = 0 \in R^{108}.$$

We can also use the right-hand side of (16) at $s = 55$ plus the fact that 
$a_{t-55,56} = 0$ (hence $da_{t-55,56} = 0$) to eliminate one more variable from 
$dX_t$ and one more from $dX_{t+1}$, enabling us to write our difference 
equation system in terms of $dx_t$ instead of $dX_t$. 

Finally, as stated, to comply with the government budget constraint 
and $dG_t = 0$, all $t$, we constantly move components of $d\tau^w_t$ and $d\tau^r_t$, 


2 Notice that unless we assign contrived parameter values, odds are zero that inequality 
(4) will be on the exact margin of binding. Thus, for each $s$, $dl_{t+1-s,s} = 0$ or the first 
part of (10) will hold. (For noninfinitesimal policy changes, we would have to worry 
about [4] beginning and ceasing to bind.) 
say $d\bar{\tau}^w_t$ and $d\bar{\tau}^r_t$. If we use the implied equation at time $t$, and again at 
$t + 1$, in place of $d\bar{\tau}^w_t$, $d\bar{\tau}^r_t$, $d\bar{\tau}^w_{t+1}$, and $d\bar{\tau}^r_{t+1}$ and define 

$$dT_t \equiv dS_t - (d\bar{\tau}^w_t, d\bar{\tau}^r_t, 0, 0, 0)',$$

our basic system of equations becomes 

$$H^3_1 \cdot dx_{t+1} + H^3_2 \cdot dT_{t+1} + H^3_3 \cdot dx_t + H^3_4 \cdot dT_t = 0 \in R^{108},$$

with $H^3_1$ and $H^3_3$ 108 x 108 matrices and $H^3_2$ and $H^3_4$ 108 x 5. Multiplying 
through by negative the inverse of $H^3_1$ yields a system 

$$dx_{t+1} = M \cdot dx_t + N_1 \cdot dT_t + N_2 \cdot dT_{t+1}, \quad \text{all } t \ge 0, \tag{19}$$

with $M$ 108 x 108 and $N_1$ and $N_2$ 108 x 5. Because we focus on 
deviations from a stationary state, the elements of all three matrices 
are time autonomous, as in (15). 

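We can now write our basic dynamic equation. Let $\theta$ be a tax or debt 
variable permanently changed at time $T$, the announcement being 
made at time 0. (Section V considers temporary changes as well.) We 
shall use the notation 

$$\frac{d\tau^c_t}{d\theta} = \begin{cases} 1, \text{ all } t \ge T, & \text{if } \theta = \tau^c \\ 0 & \text{otherwise,} \end{cases}$$

and similarly for the other elements of $dT_t$ and $dT_{t+1}$. Then (19) 
implies 

$$\frac{dx_{t+1}}{d\theta} = M \cdot \frac{dx_t}{d\theta} + N_1 \cdot \frac{dT_t}{d\theta} + N_2 \cdot \frac{dT_{t+1}}{d\theta}, \quad \text{all } t \ge 0. \tag{20}$$

This is our version of (15). The endogenous variables are $dx_t/d\theta$, all $t 
> 0$. 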


III. Phase Diagrams 

The Introduction began with a brief description of the saddle point 
dynamics of a simple representative agent model. This section dis¬ 
cusses system (20) in similar terms. Let us start with the matter of 
initial conditions. 

Consider the endogenous variables in (20): 

$$\frac{dx_t}{d\theta} \equiv \left(\frac{dx_{1,t}}{d\theta}, \ldots, \frac{dx_{108,t}}{d\theta}\right), \quad \text{all } t \ge 0.$$

The first 54 elements of this vector consist of the asset holdings of 
living families (except the youngest, who have none) carried over 
from the preceding period. When government announces (at time 0) 
policy changes, these components of $x_0$ cannot vary. In other words, 
they are “predetermined” variables, giving us a set of initial conditions 

$$\frac{dx_{i,t}}{d\theta} = 0 \quad \text{for } t = 0,\ i = 1, \ldots, 54. \tag{21}$$

The remaining elements of $dx_0/d\theta$ are consumption figures. History 
does not fix them; they correspond to the variables in the representative 
agent example that could "jump.” Thus our analysis must 
determine $dx_{i,0}/d\theta$, all $i = 55, \ldots, 108$, as well as $dx_t/d\theta$, all $t > 0$. 

Computing stationary-state comparative static results is straightforward: 
if $I$ is the identity matrix and $dT^*/d\theta$ gives the eventual permanent 
change in exogenous variables, (20) implies 

$$\frac{dx^*}{d\theta} = [I - M]^{-1} \cdot [N_1 + N_2] \cdot \frac{dT^*}{d\theta}. \tag{22}$$

This is the analogue of (13). Use the notation $dx_t/d\theta \equiv (\xi_{1t}, \xi_{2t})$ and 
$dx^*/d\theta \equiv (\xi_1^*, \xi_2^*)$, $\xi_{1t}, \xi_{2t}, \xi_1^*, \xi_2^* \in R^{54}$. 

Figure 2 shows a phase diagram exactly analogous to the representative 
agent case (with each axis now corresponding to a space of 
dimension 54). Initial condition (21), $\xi_{10} = 0$, puts us on the ordinate. 
Point $S$ gives a stationary solution after policy changes announced at 
time 0 have all taken effect. In the simple representative agent model 
mentioned in the Introduction, the optimal solution began at point $A$ 
and followed the stable arm of the saddle thereafter. 
We face two problems: (1) neither government nor any single 
agent’s optimization seems to compel starting at any particular point 
above $\xi_{10} = 0$, for example, point $A$; and (2) the structure of the 
problem does not seem to guarantee figure 2’s saddle about $S$ in the 
first place. 3 

Consider issue 1, assuming figure 2. If we mandate a start at point 
$A$, the graph determines $\xi_{20}$. Equilibrium and perfect foresight 
then require a transit along the stable arm to $S$. For a finite policy 
change $\Delta\theta$, 

$$x_t \approx x^* + \frac{dx_t}{d\theta} \cdot \Delta\theta, \quad \text{all } t \ge 0,$$

with $dx_t/d\theta$ coming from (20). In terms of figure 1, if $\Delta\theta$ is not too 
large, we stay in the vicinity of $k^*$ at $t$. Then knowing (from [20]) 
figure 1’s (precise) slope at $k^*$ is very useful. That is not true if we start 
at $B$ or $C$, in which event the model takes us away from $S$. 

The paths from $B$ and $C$ may not be feasible: a high starting value 
for $c_{t,1}$, for instance, may leave families born at time $t$ unable 55 years 
later to satisfy simultaneously first-order condition (11) and the constraint 
$a_{t,56} = 0$, and in such cases no perfect foresight equilibria begin 
at either $B$ or $C$. 4 Conceivably the path $AS$ is a unique equilibrium. 
The model with two investment goods (“predetermined” or “histor¬ 
ical variables”) and corresponding prices ("nonhistorical variables”) of 
Shell and Stiglitz (1967) is an example: Shell and Stiglitz systematically 
disqualify (as a perfect foresight equilibrium) each nonconvergent 
dynamic path. 

Unfortunately, the high dimensionality of our system limits the 
analysis to the vicinity of S. Thus, following virtually all the rational 
expectations literature, we shall assume that if we have the saddle 
point of figure 2, the model takes path AS. We might argue that in 
practice the U.S. economy does seem to follow the pattern of a model 
that remains close to a stationary solution, giving importance to S and 
to the dynamic analysis with starting point A. Or we might argue that 
a simplification of the problem of coordinating agents’ anticipations 
about the economy’s future time path favors AS. 

This paper’s procedure does make a contribution with regard to 
issue 2: it provides a way of checking whether, locally, we have the 


3 As stated, these issues are not absent from Auerbach and Kotlikoff's analysis. They 
are not necessarily avoided by any but the simplest representative agent models (see 
Judd 1987). 

4 Mathematically, the matrices $H$ in the lines above (19) are functions of the economy's 
state variables, hence, now of time. The infeasibility discussed above might show 
up in the form of a singular version of the matrix inverted to obtain (19) appearing 
within a finite number of periods. 
saddle of figure 2. There are three basic cases. The first is the illustrated 
one. 

Case 1. In terms of (20), the matrix $M$ has 54 distinct, nonzero 
eigenvalues $\lambda_i$, $i = 1, \ldots, 54$, with $|\lambda_i| < 1$ and 54 distinct eigenvalues 
$\lambda_i$, $i = 55, \ldots, 108$, with $|\lambda_i| > 1$. The projection of the linear space 
generated by the eigenvectors corresponding to $\lambda_i$, $i = 1, \ldots, 54$, onto 
the space of predetermined initial conditions (see [21]) has dimension 
54. 

To understand the first part of the definition, notice that the general 
solution to (20), with all policies in effect at time $T$, has the 
form (recall [22]) 

$$\frac{dx_t}{d\theta} = \frac{dx^*}{d\theta} + \omega_1 \cdot v_1 \cdot (\lambda_1)^t + \cdots + \omega_{108} \cdot v_{108} \cdot (\lambda_{108})^t, \quad \text{all } t \ge T, \tag{23}$$

where $v_i \in R^{108}$ is the eigenvector corresponding to $\lambda_i$ and $\omega_i \in R^1$ is 
an arbitrary weight. Given case 1, convergence (to $S$ in fig. 2) requires 

$$\omega_i = 0, \quad i = 55, \ldots, 108. \tag{24}$$

Initial conditions (21) then pin down $\omega_i$, $i = 1, \ldots, 54$, uniquely, given 
case 1’s second restriction. 5 

A second case, which might be called “instability,” precludes reaching 
$S$. 

Case 2. The matrix $M$ has distinct, nonzero eigenvalues $\lambda_i$, $i = 
1, \ldots, 108$; $|\lambda_i| > 1$ for more than 54 indices $i$. 

In terms of (23), in this case to leave enough $\omega_i$’s conceivably to have 
$dx_0/d\theta$ satisfy (21), we are forced to have, for one or more $i$, $\omega_i \ne 0$ and 
$|\lambda_i| > 1$, causing explosive, rather than convergent, growth. 

The third category is often labeled “indeterminacy." 

Case 3. The matrix $M$ has distinct, nonzero eigenvalues $\lambda_i$, $i = 
1, \ldots, 108$; $|\lambda_i| < 1$ for more than 54 indices. 

In general, in this case we can put zero loadings on the explosive 
eigenvalues and still have more arbitrary $\omega_i$’s than initial conditions. 
In terms of figure 2, the convergent arm is too “thick": there are a 
continuum of points $A$ consistent with $\xi_{10} = 0$ and with being on the 
saddle’s stable arm. 

As stated, Laitner (1982), Woodford (1984), and Kehoe and Levine 
(1985) review the literature of perfect foresight equilibrium models 
and find examples in each case. 6 We now check our model by deriving 
eigenvalues. 
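
The eigenvalue count behind the three cases can be sketched in a few lines of Python. This is a simplified check under stated assumptions: it works on a hypothetical 2 x 2 matrix with one predetermined variable instead of the 108 x 108 matrix with 54, it ignores eigenvalues of modulus exactly one, and it does not test the projection (rank) condition in the second part of case 1. The function names are illustrative.

```python
import cmath


def eigenvalues_2x2(M):
    """Eigenvalues of a 2x2 matrix from its trace and determinant (may be complex)."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    return [(tr + disc) / 2.0, (tr - disc) / 2.0]


def classify(eigs, n_predetermined):
    """Compare the number of eigenvalues inside the unit circle with the
    number of predetermined variables (54 in the text, 1 in this toy)."""
    stable = sum(1 for lam in eigs if abs(lam) < 1.0)
    if stable == n_predetermined:
        return "case 1 (saddle)"
    if stable < n_predetermined:
        return "case 2 (instability)"
    return "case 3 (indeterminacy)"


saddle = classify(eigenvalues_2x2([[0.5, 0.2], [0.1, 1.5]]), 1)
unstable = classify(eigenvalues_2x2([[1.2, 0.0], [0.0, 1.5]]), 1)
indeterminate = classify(eigenvalues_2x2([[0.2, 0.0], [0.0, 0.5]]), 1)
```

For a matrix of the size used in the paper one would obtain the eigenvalues from a numerical linear algebra routine; only the counting step carries over unchanged.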


5 Case 1’s second restriction rules out having the saddle's stable arm perpendicular to 
the abscissa in fig. 2. 

6 This analysis closely parallels the macroeconomic literature on linear rational expectations 
models as well. See, in particular, Blanchard and Kahn (1980). 


IV. Saddle Point Results 

This section generates the matrix M and computes eigenvalues. It 
does this for each of the parameter combinations in Auerbach and 
Kotlikoff’s “sensitivity analysis table,” table 5.3. (We also add trial 7, 
with the production elasticity of substitution equal to .6.) 

Auerbach and Kotlikoff solve their model using an iterative numer¬ 
ical method. The algorithm’s success at computing convergent equi¬ 
librium transition paths hints that the latter do not have the instability 
problem of case 2 above. A concern that case 3 may occur, or even 
predominate, remains, however. 

Table 1 presents eigenvalue calculations. The worry of the preceding 
paragraph turns out not to be supported: for each of the 12 
parameter combinations, exactly 54 of the 108 eigenvalues turn out to 
have absolute value less than one and 54 to have absolute value 
greater than one. Thus case 1 prevails for this particular model (at 
least for the specified parameter values and stationary solutions), and 
we can hold the saddle diagram of figure 2 in our minds as we proceed. 

For economy of space, the table presents only summary information. 
In every trial, most eigenvalues are complex (typically all but 2-4). 
The eigenvalues with absolute value exceeding one ultimately receive 
zero loadings in our analysis (see [23]-[24]), so only the smaller 
ones are of special interest. In the case of complex numbers, periodicities 
associated with the latter range from several years to almost 
a lifetime. 

To develop one intuition for why case 1 emerges, with precisely 54 
eigenvalues of each type, consider two steps. Label the model of table 
1 “formulation A.” To derive “formulation B,” modify the model so 
that the capital to labor ratio in production remains constant regardless 
of the asset to labor ratio for the household sector. This may be of 
independent interest: it would be an appropriate model for a small 
country in a world with international capital flows and investors always 
in the end paying domestic taxes even on their foreign earnings. 
Table 2 presents formulation B results. 

One more step yields formulation C: dispense with the treatment 
above of the government’s budget, so that after a permanent tax rate 
change, other rates or the government’s debt changes once, sufficiently 
to stabilize government revenues at new stationary-state employment 
and asset accumulation levels. During the transition to the 
new stationary state, let $G_t$ vary now (in this one case) to preserve (9). 

In formulation C, we must have case 1: with net factor prices exogenous, 
we can solve each household’s life cycle maximization problem 
separately: given concave utility, we know that the partial equilibrium 
setup has a unique solution. 



Saddle Point Configurations 
Proceeding backward from formulation C to B, we add a sequence 
of income tax adjustments: households’ actions now affect the tax 
rates they face. (This is not to imply that any agent consciously manipulates 
tax rates; all individual households take them as given and 
beyond their control.) If the effects are small, we shall remain in case 
1. That is what happened in all the trials reported above. 7 

To move from formulation B to A, we add the potential for house¬ 
holds' savings and labor supply decisions to affect factor prices. If the 
latter are unresponsive or have little influence on household deci¬ 
sions, we can remain in case 1. What is perhaps surprising in this 
regard is that case 1 appears for every elasticity choice in table 1. 

To restate the point: The repeated occurrence of the 54 unstable- 
54 stable eigenvalue configuration in table 1 need not be viewed as 
random good luck. Such outcomes are inevitable in formulation C. 
Unless the complications of moving to formulation A disturb the 
underlying eigenvalues a great deal, case 1 will persist. 


V. Policy Simulations 

This section considers three types of policy: increases in the tax rate 
on interest income (hence, in our framework, wage tax reductions), 
increases in the taxation of consumption (coupled with lower income 
taxes), and increases in the national debt (coupled with short-term tax 
reductions). In each case, (20) generates a sequence of comparative 
static derivatives. This section shows that the methodology can handle 
permanent or temporary policy variations, or changes announced 
ahead of their implementation. Let us compare our outcomes with 
those of Auerbach and Kotlikoff and others. 

To simulate a permanent policy change announced and imple¬ 
mented at time 0, consider (23). Assume case 1. Then to attain the 
stable arm of our saddle, (24) must hold. We must choose the remain¬ 
ing 54 weights in (23) to make its right-hand side satisfy (21) at time 0. 
This last step yields 54 linear equations in 54 variables. The final part 
of the definition of case 1 ensures the existence of a unique solution. 
The Appendix presents details. 
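
The steps in the preceding paragraph can be sketched for a hypothetical system with one predetermined and one "jump" variable, so that the 54 linear equations collapse to a single one. In the Python fragment below the matrix `M`, the vector `B` (standing in for $[N_1 + N_2] \cdot dT^*/d\theta$), and the helper names are all assumptions, and real eigenvalues are assumed for simplicity.

```python
import math

M = [[0.5, 0.2], [0.1, 1.5]]   # hypothetical transition matrix (one stable root)
B = [0.3, -0.4]                # hypothetical (N1 + N2) * dT*/dtheta


def stable_eigenpair(M):
    """Eigenvalue of smaller modulus and an eigenvector (real roots, M[0][1] != 0 assumed)."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    root = math.sqrt(tr * tr - 4.0 * det)
    lam = min((tr + root) / 2.0, (tr - root) / 2.0, key=abs)
    return lam, [M[0][1], lam - M[0][0]]


def steady_state_shift(M, B):
    """Solve (I - M) x = B by Cramer's rule, the 2x2 analogue of (22)."""
    a, b = 1.0 - M[0][0], -M[0][1]
    c, d = -M[1][0], 1.0 - M[1][1]
    det = a * d - b * c
    return [(B[0] * d - b * B[1]) / det, (a * B[1] - B[0] * c) / det]


def saddle_path(M, B, periods):
    """Path x_t = x* + omega * v * lam**t with zero weight on the unstable root;
    omega is chosen so the predetermined first component starts at zero."""
    lam, v = stable_eigenpair(M)
    x_star = steady_state_shift(M, B)
    omega = -x_star[0] / v[0]
    return [[x_star[i] + omega * v[i] * lam ** t for i in range(2)]
            for t in range(periods + 1)]


path = saddle_path(M, B, 20)
initial_jump = path[0][1]      # time-0 response of the non-predetermined variable
```

The first component of `path[0]` is zero by construction, mirroring (21), while `initial_jump` plays the role of the consumption responses that the analysis must determine.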

Let us turn to a permanent policy change for time T announced at 


7 It is possible to think of ways of incorporating dynamic government behavior that 
would move us to case 2. Consider, e.g., setting tax rates once and for all at time 0 and 
letting the government's deficit absorb subsequent revenue fluctuations. A tax change 
leading to a positive deficit will increase future debt service obligations, possibly driving 
the deficit still higher and eventually leading to an explosive increase in the national 
debt (see Blinder and Solow 1973). 
time 0. Equation (23) must hold for $t \ge T$. Assume case 1. Equation 
(20) implies 

$$\frac{dx_{t+1}}{d\theta} = M \cdot \frac{dx_t}{d\theta}, \quad \text{all } t = 0, \ldots, T - 2, \tag{25}$$

$$\frac{dx_{t+1}}{d\theta} = M \cdot \frac{dx_t}{d\theta} + N_2 \cdot \frac{dT^*}{d\theta}, \quad t = T - 1, \tag{26}$$

$$\frac{dx_{t+1}}{d\theta} = M \cdot \frac{dx_t}{d\theta} + N_1 \cdot \frac{dT^*}{d\theta} + N_2 \cdot \frac{dT^*}{d\theta}, \quad t \ge T. \tag{27}$$

Equating the right-hand side of (27) at $t = T$ with (23) at $t = T + 1$ and using (25) 
and (26) to substitute into (27), we get 

$$\frac{dx^*}{d\theta} + \omega_1 \cdot v_1 \cdot (\lambda_1)^{T+1} + \cdots + \omega_{108} \cdot v_{108} \cdot (\lambda_{108})^{T+1} = M \cdot M \cdot M^{T-1} \cdot \frac{dx_0}{d\theta} + M \cdot N_2 \cdot \frac{dT^*}{d\theta} + N_1 \cdot \frac{dT^*}{d\theta} + N_2 \cdot \frac{dT^*}{d\theta}. \tag{28}$$

The unknowns are $\omega_i$, $i = 1, \ldots, 108$, and $dx_0/d\theta$. Line (24) pins down half 
of the former variables, and (21) covers half of the latter. Thus (28) 
gives us a block of 108 linear equations to solve for the 108 variables 
$(\omega_1, \ldots, \omega_{54}, dx_{55,0}/d\theta, \ldots, dx_{108,0}/d\theta) \equiv (\omega_1, \ldots, \omega_{54}, \xi_{20})$. Performing 
the required matrix manipulations, we can deduce $\xi_{20}$. Then 
$dx_t/d\theta$, all $t \ge 1$, follow from (25) and (23). 8 Again, see the Appendix. 

An important advantage of our approach is that once we have the 
derivative sequences for several different policy changes, say, $d\theta_A$, 
$d\theta_B$, and $d\theta_C$, we can combine them: 

$$dx_t = \frac{dx_t}{d\theta_A} \cdot d\theta_A + \frac{dx_t}{d\theta_B} \cdot d\theta_B + \frac{dx_t}{d\theta_C} \cdot d\theta_C. \tag{29}$$

This enables us to use the steps above to evaluate a temporary policy 
change: consider such a change announced at time 0, implemented 
then, and expiring at time $T$. It is equivalent to a permanent parameter 
modification at time 0 coupled with a permanent change of the 
same magnitude for the same parameter in the opposite direction at 
time $T$, both announced at time 0. Equation (29) allows us to find the 
joint effect. 
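
The superposition argument can be illustrated with a scalar, purely backward-looking recursion in the spirit of (15). This Python sketch is a deliberately stripped-down stand-in: with no "jump" variables an announcement has no effect before it takes hold, unlike in the full system, and the coefficients `a` and `b` and the function name `response` are hypothetical.

```python
def response(a, b, shocks):
    """Derivative sequence d_{t+1} = a * d_t + b * shocks[t], starting from d_0 = 0."""
    d = [0.0]
    for s in shocks:
        d.append(a * d[-1] + b * s)
    return d


a, b, T, horizon = 0.3, 1.2, 5, 30

permanent_on = [1.0] * horizon                                    # change at time 0, kept forever
permanent_off = [0.0 if t < T else -1.0 for t in range(horizon)]  # opposite change from time T on
temporary = [1.0 if t < T else 0.0 for t in range(horizon)]       # change in force for t < T only

# Linear combination in the sense of (29), with weights of one on each sequence.
combined = [x + y for x, y in zip(response(a, b, permanent_on),
                                  response(a, b, permanent_off))]
direct = response(a, b, temporary)
```

The sequences `combined` and `direct` coincide, which is the scalar counterpart of adding the two permanent-change derivative sequences to obtain the effect of a temporary change.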

Let us turn to some sample numerical results. Figures 3-11 present 
simulations for formulations A and B with parameters as in the first 


8 Note that our linear algebraic steps here and in the next paragraph provide an 
alternative to Judd's procedure based on Laplace transformations.
Fig. 3. Effect on wealth holdings of an increased tax on consumption


row of table 1 (this is the primary parameter choice in Auerbach and 
Kotlikoff [1987]). 

Figures 3-5 provide results for an increase in the consumption tax 
rate (balanced by income tax rate reductions). Let us consider an 
immediate permanent change and a temporary change lasting 5 
years. 

The figures present elasticities. Notice that if $G \cdot (1 + n)^t$ and $C^* \cdot (1 + n)^t$
are stationary-state, time $t$ aggregate government spending
and consumption, respectively, at these levels a consumption tax
change of $\Delta\tau^c = G/C^*$ could fully replace the income tax (recall that
initially $\tau^c = 0$). The permanent policy changes of figures 3 and 4
graph

\[ \frac{dK_t}{d\tau^c} \cdot \frac{\Delta\tau^c}{K^*} \quad \text{and} \quad \frac{dL_t}{d\tau^c} \cdot \frac{\Delta\tau^c}{L^*}. \]

Obviously many other ways of presenting results are possible; this one 
should yield numbers roughly of commensurate magnitude to those 
in Auerbach and Kotlikoff. 

Figure 5 presents corresponding welfare numbers: let $dE_t/E_t$ be the
percentage increase in lifetime resources (the latter equaling the present
value of a given family’s time endowment in the initial stationary
state) needed in the original policy regime to make each family dying
at time $t$ exactly as well off (in terms of lifetime utility) as it would be
Fig. 4. Effect on labor supply of an increased tax on consumption
Fig. 5. Welfare gain for an increased tax on consumption
Fig. 6. Effect on wealth holdings of an increased tax on interest income


under the new policy. In the case of a permanent tax change, figure 5
graphs

\[ \frac{dE_t}{d\tau^c} \cdot \frac{\Delta\tau^c}{E_t}. \]

Figures 3-5 are consistent with the existing literature: as in Summers
(1981) and Auerbach and Kotlikoff (1987), an increase in the
taxation of consumption tends to imply large increases in overall capital
accumulation. As in Auerbach and Kotlikoff, the magnitude of the
eventual welfare gains is about 2.5 percent (for formulation A). In
fact, their results at all ages are both quantitatively and qualitatively
similar to ours.9 Generations old at the time of the tax announcement
tend to be hurt: their labor years are over, but their consumption is
high. Reduction in the consumption possibilities of these cohorts
causes overall saving to rise, moving the economy closer to its “golden
rule” capital intensity over time, and future generations benefit.

Figures 6-8 repeat the analysis for increases in the rate of taxation 
on capital income. The “permanent change” cases of figures 6 and 7,
for example, graph

\[ \frac{dK_t}{d\tau^R} \cdot \frac{\Delta\tau^R}{K^*} \quad \text{and} \quad \frac{dL_t}{d\tau^R} \cdot \frac{\Delta\tau^R}{L^*}, \]

9 Compare fig. 5 with Auerbach and Kotlikoff's fig. 5.4, noting that the abscissa of
the latter measures year of birth rather than of death.
Fig. 7. Effect on labor supply of an increased tax on interest income
Fig. 8. Welfare gain for an increased tax on interest income

where

\[ \Delta\tau^R = \frac{\tau^w \cdot W^* \cdot L^* + \tau^R \cdot R^* \cdot K^*}{R^* \cdot K^*}. \]

Figure 6 agrees fairly closely with the literature: taxing interest
income more heavily adversely affects the size of $K_t$, the quantitative
impact being large. (However, our drop from a permanent tax
change never exceeds about 20 percent with formulation A, whereas
Auerbach and Kotlikoff's eventual change is greater than 35 percent.)
Somewhat surprisingly, figure 8 does not correspond with Auerbach
and Kotlikoff: they find a long-run welfare reduction of about
1 percent, while figure 8 points to a long-run increase of about 2
percent. Their figure 5.4 shows the same general pattern as figure 8:
a higher $\tau^R$ causes $dE_t/E_t$ to fall (with date of birth), then to rise, then
to fall (for cohorts born after date 0). In our case the middle-run rise
is great enough to leave the asymptotic level positive. In fact, in our
model the long-run decline in capital accumulation is counterbalanced
by an increase in the labor supply that causes GNP actually to
rise slightly. Apparently the mitigation of labor-leisure distortions
accompanying a reduction in $\tau^w$ is more important for us than the
intertemporal inefficiencies promoted by increases in $\tau^R$. Auerbach
and Kotlikoff’s results can be consistent with ours, but only if the
welfare curves analogous to figure 1 are not monotonic.10

Actually, the discrepancies between figure 8 and Auerbach and 
Kotlikoff’s figure 5.4 are not as great as it might first appear, for 
Auerbach and Kotlikoff also consider an increase in wage taxation. In 
our model, such a change is exactly the same as lowering the tax on 
interest income.11 Auerbach and Kotlikoff find a long-run welfare
reduction of about 1 percent from an increase in wage taxation. If we 
flip the curve in figure 8 about the abscissa (so that we are checking a 
wage tax increase), the sign conflict disappears entirely. The new 
graph looks qualitatively similar to Auerbach and Kotlikoff’s wage 
tax case in figure 5.4. In other words, the nonmonotonicity for the 
analogue of figure 1 required above is evident in Auerbach and Kot¬ 
likoff’s work alone. 

10 Judd’s (1987) results seem to agree with Auerbach and Kotlikoff’s: if one subtracts 
col. 1 from 2 (or 4 from 5) in his tables 1 and 2, long-run increases in $\tau_R$ offset with
decreases in $\tau_w$ virtually always yield welfare losses. However, the experiment is not
really the same: Judd’s representative agent formulation, roughly speaking, measures 
welfare with a (discounted) sum of all the figures in fig. 8. Recall that Summers’s (1981) 
model, which points to the apparent desirability of lower taxes on capital, differs from 
ours as well: he did not allow labor supply variability, which, as we have just noted, 
seems to play a key role in our findings. 

11 Auerbach and Kotlikoff, on the other hand, are making noninfinitesimal changes 
with a nonlinear system.

TABLE 3

Long-Run Effect of a Permanent Change in $\tau^R$

Trial    Parameter Changes     $(dE^*/d\tau^R) \cdot (\Delta\tau^R/E^*)$
  1      None*                   .020
  2      $\gamma = .10$          .043
  3      $\gamma = .50$          .012
  4      $\rho = .30$            .021
  5      $\rho = 1.50$           .018
  6      $\sigma = .80$         -.002
  7      $\sigma = .60$         -.040
  8      $\sigma = 1.25$         .036
  9      $\delta = .05$          .017
 10      $\delta = -.03$         .020
 11      $\alpha = .50$          .035
 12      $\alpha = 3.0$          .012

* $\gamma = .25$, $\rho = .80$, $\sigma = 1.00$, $\delta = .015$, $\alpha = 1.50$, $\epsilon = .25$, A (see eq. ffl)) = 892657593,
$\tau^w = .15$, $\tau^R = .15$, $\tau^c = 0$, D = 0, and n = 0.


Table 3 pursues this result another step: it provides welfare elas¬ 
ticities comparing stationary states for each of the 12 parameter com¬ 
binations of table 1. The long-term welfare gains of a switch to 
heavier taxation on interest income persist in all cases for which the 
production elasticity of substitution is greater than or equal to one. 

Figures 9-11 cover national debt. The “permanent changes” of 
figures 9 and 10, for example, graph 

\[ \frac{dK_t}{dD_1} \cdot \frac{G}{K^*} \quad \text{and} \quad \frac{dL_t}{dD_1} \cdot \frac{G}{L^*}. \]

(In other words, the experiment is to deficit-finance all time 0 government
spending and hold $D_t = D_1$, all $t \ge 1$.)

Figure 9 reveals the same surprising initial “crowding-in” effects 
that Auerbach and Kotlikoff find. This cannot come solely from 
young families raising their current saving in anticipation of future 
taxes for debt service because the present value of the latter falls short 
of the current debt increment, given finite family life spans. The rise 
in labor supply (stemming from deficit-financed tax reductions at the 
moment of debt increase) also militates against it; leisure and con¬ 
sumption are substitutes here. It is the first effect combined with 
anticipated future interest rate increases—from a smaller future capi¬ 
tal to labor ratio—that must provide the explanation. Government 
debt crowds out physical capital in the long run virtually unit for unit. 

In terms of welfare, increases in government debt benefit only 
cohorts alive at the time of the change. After 55 years, as the cohorts 
alive during the period of lower taxes and increased deficits pass
Fig. 9. Effect on wealth holdings of an increased government debt
Fig. 11. Welfare gain for an increased government debt


away, only higher taxes (for debt service in the permanent policy 
change case) and a lower physical capital stock remain, and welfare 
registers a sharp decline. 

If we compare figure 11 and Auerbach and Kotlikoff’s table 6.2 
and note that our permanent change corresponds to their 1-year tax
cut case but is scaled to three times the magnitude, the quantitative 
agreement is striking. 


VI. Conclusion 

I have checked the eigenvalue constellations for the Auerbach and 
Kotlikoff stationary solutions and found neither instability nor inde¬ 
terminacy (in 12 trials). The simulations are, for the most part, in 
close agreement with existing work, one possible exception being the 
long-term welfare effects of increases in the tax burden on interest 
income. 

The underlying model of this paper is, of course, subject to many 
criticisms. For example, on the production side, it omits costs of ad¬ 
justment, technological progress, and depreciation of physical capital. 
On the household side, it leaves out age-related changes in family 
composition, inheritances and bequests, nonnegativity constraints on 
asset holdings, and uncertain life spans. There are no random shocks 
in the economy. The model has no provisions for Keynesian unemployment.

Nevertheless, I have demonstrated a methodology for assessing
small policy changes within the context of a rather complicated over¬ 
lapping generations model. The substantially different results for for¬ 
mulations A and B apparent in figures 3—11 underline the impor¬ 
tance of there being a general equilibrium framework. Many of the 
items in the preceding paragraph could be incorporated into the 
analysis. The procedure does enable one to check for both instability 
and indeterminacy, and it can handle elaborate combinations of 
simultaneous policy adjustments with comparative ease. 


Appendix 

A. Algebraic Steps for a Permanent Policy Change Announced and Begun 
at Time 0 

Initial conditions imply $\xi_{10} = 0$. Lines (23) and (24) yield, at time t = 0,



(Al) 


Form a 108 x 54 matrix V, the ith column of which is the eigenvector $v_i$.
Partition V into two 54 x 54 submatrices: 



Define 


to — 

Then (Al) in matrix form is 

0 = + V,, • w, (A2) 

€20 = (if + V 21 1 w. (A3) 

The last part of the definition of case 1 makes $V_{11}$ invertible. Thus (A2) yields
a unique $\omega$. Equation (A3) subsequently gives $\xi_{20}$, and (23) gives $dx_t/d\theta$, all
$t \ge 1$.



B. Algebraic Steps for a Permanent Policy Change Announced at Time 0 and 
Begun at Time T > 0 

Using all 108 eigenvectors of M, form a matrix V, with column i being $v_i$.
Form a diagonal matrix $\Lambda$ from the eigenvalues $\lambda_i$ of M, with element (i, i) of
$\Lambda$ being $\lambda_i$. Let $\omega$ be as in step 1. Partition both V and $\Lambda$ into four 54 x 54
submatrices:

By construction, $M \cdot V = V \cdot \Lambda$. Hence,

\[ M^{T+1} = V \cdot (\Lambda)^{T+1} \cdot (V)^{-1}. \]


Lines (21), (24), and (28) yield 

(a) + v - ( * r '(?)-»•<*>'*'■< v >-'-(£) + i > 


n 

(A4) 

(A5) 


where $J \in R^{108}$ is a known vector; see (28). Multiply through first by $V^{-1}$, then
by $\Lambda^{-T-1}$, then by V. This yields


V ■ (A) -T-1 •(V)- 1 


(&) 

+ ( v " 

(An) 1 • M\ 

UiJ 

lv„ 

(An) -1 ■ at) 


-Q + v-'A)"'" 


(A6) 


In the first 54 equations in system (A6), the only unknowns are $\omega$. Having
nonzero eigenvalues ensures the existence of $(\Lambda_{11})^{-1}$. The last part of the
definition of case 1 implies that $V_{11}$ is invertible. Thus the first 54 equations
yield a unique solution for $\omega$. The second 54 then give $\xi_{20}$. Line (25) then
yields $dx_t/d\theta$ for t = 1, ..., T - 1, and (23) gives $dx_t/d\theta$ for $t \ge T$.

Book Review


Trade Policy in a Changing World Economy. By Robert E. Baldwin. 

Chicago: University of Chicago Press, 1989, Pp. 273. $59.95. 

This volume collects the author’s previously published papers on policy to¬ 
ward international trade. Most of them first appeared in the past 5 years, 
although older vintages include one from 1969 and two from 1976. They 
range from theoretical investigations (both positive and normative) into the 
bases for restricting trade to occasional pieces that review recent U.S. trade 
policy or explore multilateral trade confrontations and agreements. They 
reflect the author’s rich experience (agreeably detailed in a personal introduc¬ 
tory essay) as both a distinguished scholar of international economics and an 
active participant in U.S. international economic policymaking. They are uni¬ 
formly well reasoned, calm, dispassionate, pragmatic, patient, constructive, 
and somewhat bland. 

The central themes of all the papers remain relevant today, and only a few 
passages have grown dated. That property lauds the durability of Baldwin’s 
analyses, but it also bespeaks what one might call a low-level equilibrium trap 
in trade policy itself: plus ça change, plus c'est la même chose. Economic analyses
of trade policy broadly agree that valid cases for restricting trade in pursuit of 
long-run global welfare are rare, and optimal restrictions on behalf of long- 
run national welfare not much more abundant. With trade controls nonethe¬ 
less ubiquitous, the problem is to explain the political mechanism by which 
they arise and persist, and to suggest tactical maneuvers that hold out hope of 
Pareto improvement. In both tone and substance, Baldwin’s essays embrace 
this mainstream position. The recent intellectual fad of strategic trade policy 
gets only a passing glance; Baldwin probably finds little in its prescriptions 
that is likely to advance either national or global welfare through trade policy 
changes. 

The essays state a coherent view of recent trends in trade policy, both 
within the United States and internationally. Baldwin accepts that barriers 
increasingly constrict trade, and trade policy is an increasingly important 
source of friction among countries. He finds the diminished hegemonic role 
of the United States in the international system to be a major causal factor, 
but he stands on the optimistic side of majority opinion with regard to the 
efficiency cost of the protectionist trend. His relative optimism about the 
policymaking system is balanced, however, by pessimism about chances of 
stopping the erosion. He offers proposals for abating disputes over export 
subsidies and special assistance to industries and replacing the withered dispute
settlement capability of the General Agreement on Tariffs and Trade.
They involve international recognition of the forms of assistance appropriate 
for national policy objectives, coupled with a return to item-by-item international
bargaining to promote elimination of inappropriate uses of assistance
and internalize the negative external effects of legitimate ones. He does not 
seem sanguine, however, about their efficacy or prospects for acceptance. 

Why are Pareto-improving trade policies so elusive? Baldwin offers an 
eclectic political economy analysis of the determinants of U.S. trade policy 
(chaps. 3-6). It rests on standard assumptions: small beneficiaries (consum¬ 
ers) cannot match producer groups’ abilities to form coalitions in order to 
influence trade policy, and lump-sum compensation to bribe losers to accept 
changes that increase efficiency is largely infeasible. Consequently the protec¬ 
tion levels that industry groups can command increase with their sizes (voting 
strength, contributions) and their internal cohesion. He assigns only a minor 
role to rent-seeking outlays in determining trade policy (pp. 128-29), consis¬ 
tent with evidence (Esty and Caves 1983) that the (political) marginal product 
of a sector’s rent-seeking expenditures declines rather rapidly as its outlays 
increase. The broad course of trade policy also depends on conflicts with 
other political objectives, party ideology, and other extraneous factors. 

Although this account of the political economy of trade policy represents a 
mainstream consensus, it runs into substantial anomalies. Why do a sector’s
producers campaign for (increased) protection only when their quasi rents 
are being diminished by import competition? Why does a sector almost never 
obtain enough more protection to halt its displacement by competing im¬ 
ports? Why do producers’ pleas for protection emphasize parity with the 
protection enjoyed by their foreign competitors (which standard theory pro¬ 
claims is irrelevant to their own welfare)? Why do producers seek benefits that 
are provided inefficiently by trade restrictions (i.e., fall short of the benefits 
that they could obtain in other forms at the same cost to others)? Baldwin 
recognizes each anomaly and offers interesting but scattered comments about 
them. One wonders whether these anomalies will submit to a unified explana¬ 
tion resting on assumptions different from those of the standard political 
economy of trade policy. 

Make the following assumptions about the political system that determines 
levels of protection: (1) There exists a conservative social welfare function 
(Corden 1974, p. 107) that, as a matter of fair treatment, makes agents eligi¬ 
ble for compensation when their income streams (including appropriated
rents) fall short of their reasonable expectations. (2) The social welfare func¬ 
tion includes a nationalistic preference for economic transactions between 
fellow citizens that can lead the benefits of international transactions to be 
discounted. These assumptions, both espoused by Baldwin, can indeed ex¬ 
plain the preceding anomalies as well as other phenomena, although not to 
the exclusion of the standard approach. Consider the following implications. 

1. A sector’s success in obtaining protection should depend not on the 
benefits that protection can yield in long-run equilibrium but on the 
threatened quasi rents that it can preserve. That the impairment of quasi 
rents by international competition chiefly drives quests for protection is recognized
by many statutes and is evident in practice. That protection is sought
to relieve distress is consistent with observation (p. 219) that protection rarely,
if ever, halts the decline of a sector pressed by import competition. It thus
preserves wasting quasi rents but sustains few permanent rents. There are, 
however, ways of reconciling the prevalence of distress as a condition for 
increased protection with the standard rent-seeking model. Such a crisis may 
ease the affected producers’ task of overcoming the free-rider problem 
(p. 112), and the benefits from investing to preserve impaired quasi rents will
not be dissipated by the entry of new competitors as a grab for rents would
be (p. 105).

2. Because fairness exists in the eye of the beholder, producer groups will 
invoke high protection of their foreign competitors as an argument for their 
own protection if perceived fairness is important in determining trade policy 
decisions and if producer democracy can achieve acceptance as a form of fair 
treatment. Baldwin notes the popularity of this argument and its recent en¬ 
trenchment in mainstream U.S. official policy, as well as its prominence in the 
campaign against unconditional most-favored-nation treatment. The calls for 
protection to attain fairness to producers conform to the assumptions pro¬ 
posed above, but the association is subject to two important qualifications. 
First, the quest for fairness to producers is sometimes presented not as a plea 
for protection but as a threat-cum-promise strategy for moving toward more 
liberal policies both at home and abroad (chap. 15). Indeed, its efficacy for 
this purpose cannot be ruled out if a country can credibly commit to the 
strategy, but the credibility of commitments is as central a problem here as in 
implementing any theoretical prescription of strategic trade policy. Second, 
where intraindustry trade is prominent, the producer-democracy position 
gains some of the economic sense that it otherwise lacks. Increased import 
competition impels rationalizing responses by competing domestic producers 
that will have higher expected returns if at the same time net revenues from 
exports are rising (because of reduced protection abroad), so that domestic 
producers can indeed benefit directly from the reduction of their rivals' pro¬ 
tection. This connection makes sense only for differentiated products, of 
course, but many sectors apparently adjust to international competition as if 
the requisite differentiation were present. 

3. The assumptions taken together imply that the restriction of imports will 
be chosen to relieve sectoral misfortunes even when imports are not the main 
source of distress; nor is restriction the cheapest way to provide benefits. 
Although evidence confirming this corollary is too abundant to mention, 
Baldwin points to important qualifications. Implementation of the potentially 
efficient alternative of direct subsidy faces daunting incentive problems (pp.
110-11). Also, in the United States during the 1960s the promise of adjustment
assistance as an alternative to protection bought substantial support for
trade liberalization, which then eroded as the narrowness of grounds for 
access to assistance became evident (pp. 58-59). 

4. In the context of fair treatment, the preference for domestic transactions 
implies that producers should have less success gaining public assistance to 
preserve their export markets than to repel import competition. Baldwin 
suggests that common observation confirms this corollary. However, he also 
observes that fairness, as a criterion for trade policy, has the major drawback 
that treatment regarded as fair by one country’s citizens may be viewed as 
unfair by voters in other countries with different values or perceptions. For 
example, export-increasing subsidies seem a fair way to implement a conservative
social welfare function when an exporting industry’s fortunes meet an
unforeseen (unfair) reverse, but the same subsidies seem unfair to affected 
competitors abroad. 

5. The unity generally displayed by a sector’s capital and labor in seeking 
protection is consistent with the fair treatment mechanism and its implicit 
assumption of switching costs or sectoral specificity of sunk investments (both 
skills and physical capital). It is not consistent with the standard long-run 
general equilibrium model (pp. 121-23).

6. Fairness as a criterion for trade policy can generate inconsistencies in
national policy as well as international conflicts. The conservative social wel¬ 
fare function calls for temporarily protecting a sector suffering losses or 
unemployment from import competition, especially on the presumption that 
the sector thereby gains time or resources to adjust. The protection itself 
tends to restore the invaded quasi rents and thus remove the rationale for the 
protection. As Aggarwal, Keohane, and Yoffie (1987) argued, policy based on 
perceived fair treatment can exhibit such oddities as cycles in levels of protec¬ 
tion. Assistance is removed once normal profits are restored, but producers’ 
incomes again shrivel if adjustment has in fact not occurred, warranting the 
restoration of protection. 

In conclusion, an approach to trade policy based on assumed contents of a 
social welfare function can explain a lot, although not to the exclusion of the 
standard approach that emphasizes equilibrium protection levels determined 
by individual utility maximization and rent-seeking behavior. The com¬ 
prehensive papers in Baldwin’s collection, among their many merits, provide 
grist for reflecting broadly on the mechanisms driving the choices of trade 
policy. 

Richard E. Caves 

Harvard University

Our estimates of the pay-performance relation (including pay, options,
stockholdings, and dismissal) for chief executive officers indi¬
cate that CEO wealth changes $3.25 for every $1,000 change in 
shareholder wealth. Although the incentives generated by stock 
ownership are large relative to pay and dismissal incentives, most 
CEOs hold trivial fractions of their firms' stock, and ownership levels 
have declined over the past 50 years. We hypothesize that public 
and private political forces impose constraints that reduce the pay- 
performance sensitivity. Declines in both the pay-performance relation
and the level of CEO pay since the 1930s are consistent with this
hypothesis. 


The conflict of interest between shareholders of a publicly owned 
corporation and the corporation’s chief executive officer (CEO) is a 
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classic example of a principal-agent problem. If shareholders had 
complete information regarding the CEO’s activities and the firm’s 
investment opportunities, they could design a contract specifying and 
enforcing the managerial action to be taken in each state of the world. 
Managerial actions and investment opportunities are not, however, 
perfectly observable by shareholders; indeed, shareholders do not 
often know what actions the CEO can take or which of these actions 
will increase shareholder wealth. In these situations, agency theory 
predicts that compensation policy will be designed to give the man¬ 
ager incentives to select and implement actions that increase share¬ 
holder wealth. 

Shareholders want CEOs to take particular actions—for example, 
deciding which issue to work on, which project to pursue, and which 
to drop—whenever the expected return on the action exceeds the 
expected costs. But the CEO compares only his private gain and cost 
from pursuing a particular activity. If one abstracts from the effects 
of CEO risk aversion, compensation policy that ties the CEO’s welfare 
to shareholder wealth helps align the private and social costs and 
benefits of alternative actions and thus provides incentives for CEOs 
to take appropriate actions. Shareholder wealth is affected by many 
factors in addition to the CEO, including actions of other executives 
and employees, demand and supply conditions, and public policy. It 
is appropriate, however, to pay CEOs on the basis of shareholder 
wealth since that is the objective of shareholders. 

There are many mechanisms through which compensation policy 
can provide value-increasing incentives, including performance- 
based bonuses and salary revisions, stock options, and performance- 
based dismissal decisions. The purpose of this paper is to estimate the 
magnitude of the incentives provided by each of these mechanisms. 
Our estimates imply that each $1,000 change in shareholder wealth 
corresponds to an average increase in this year’s and next year’s salary 
and bonus of about two cents. We also estimate the CEO wealth conse¬ 
quences associated with salary revisions, outstanding stock options, 
and performance-related dismissals; our upper-bound estimate of the 
total change in the CEO’s wealth from these sources that are under 
direct control of the board of directors is about 75¢ per $1,000
change in shareholder wealth. 

Stock ownership is another way an executive’s wealth varies with 
the value of the firm. In our sample CEOs hold a median of about
0.25 percent of their firms’ common stock, including exercisable stock 
options and shares held by family members or connected trusts. Thus 
the value of the stock owned by the median CEO changes by $2.50 
whenever the value of the firm changes by $1,000. Therefore, our
final all-inclusive estimate of the pay-performance sensitivity
(including compensation, dismissal, and stockholdings) is about
$3.25 per $1,000 change in shareholder wealth.
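The arithmetic behind the all-inclusive figure can be laid out explicitly. The sketch below uses only the numbers quoted in the text (a median stake of about 0.25 percent and an upper bound of about 75 cents per $1,000 from sources under board control); the function and variable names are hypothetical.

```python
def ownership_sensitivity(share_fraction, per=1000.0):
    """Dollar change in the value of the CEO's stock per `per` dollars of firm value."""
    return share_fraction * per


# Median CEO stake of about 0.25 percent of common stock, as quoted in the text.
stock_component = ownership_sensitivity(0.0025)

# Upper-bound estimate from pay revisions, options, and dismissal, per $1,000.
board_controlled_component = 0.75

# All-inclusive sensitivity per $1,000 change in shareholder wealth.
total_sensitivity = stock_component + board_controlled_component
print(round(stock_component, 2), round(total_sensitivity, 2))
```

The stock component dominates: on these figures it accounts for roughly three-quarters of the total.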

In large firms CEOs tend to own less stock and have less compensa¬ 
tion-based incentives than CEOs in smaller firms. In particular, our 
all-inclusive estimate of the pay-performance sensitivity for CEOs in 
firms in the top half of our sample (ranked by market value) is $1.85 
per $1,000, compared to $8.05 per $1,000 for CEOs in firms in the 
bottom half of our sample. 

We believe that our results are inconsistent with the implications of 
formal agency models of optimal contracting. The empirical relation 
between the pay of top-level executives and firm performance, while 
positive and statistically significant, is small for an occupation in which 
incentive pay is expected to play an important role. In addition, our 
estimates suggest that dismissals are not an important source of mana¬ 
gerial incentives since the increases in dismissal probability due to 
poor performance and the penalties associated with dismissal are both 
small. Executive inside stock ownership can provide incentives, but 
these holdings are not generally controlled by the corporate board, 
and the majority of top executives have small personal stockholdings. 

Our results are consistent with several alternative hypotheses; 
CEOs may be unimportant inputs in the production process, for ex¬ 
ample, or their actions may be easily monitored and evaluated by 
corporate boards. We offer an additional hypothesis relating to the 
role of political forces in the contracting process that implicitly regu¬ 
late executive compensation by constraining the type of contracts that 
can be written between management and shareholders. These polit¬ 
ical forces, operating both in the political sector and within organiza¬ 
tions, appear to be important but are difficult to document because 
they operate in informal and indirect ways. Public disapproval of high 
rewards seems to have truncated the upper tail of the earnings distri¬ 
bution of corporate executives. Equilibrium in the managerial labor 
market then prohibits large penalties for poor performance, and as 
a result the dependence of pay on performance is decreased. Our 
findings that the pay-performance relation, the raw variability of pay 
changes, and inflation-adjusted pay levels have declined substantially 
since the 1930s are consistent with such implicit regulation. 


I. Estimates of the Pay-Performance Sensitivity 

We define the pay-performance sensitivity, b, as the dollar change in 
the CEO’s wealth associated with a dollar change in the wealth of 
shareholders. We interpret higher b's as indicating a closer alignment
of interests between the CEO and his shareholders. Suppose, for 
example, that a CEO is considering a nonproductive but costly “pet
project” that he values at $100,000 but that will diminish the value of
his firm’s equity by $10 million. The CEO will avoid this project if his
pay-performance sensitivity exceeds b = .01 (through some combination
of incentive compensation, options, stock ownership, or probability
of being fired for poor stock price performance) but will adopt the
project if b < .01. 
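The pet-project example amounts to comparing the CEO's private benefit with his share b of the loss in equity value. A minimal sketch of that decision rule, with hypothetical function names and the dollar figures from the example above:

```python
def breakeven_sensitivity(private_value, equity_loss):
    """Value of b at which the CEO is indifferent about the project."""
    return private_value / equity_loss


def adopts_pet_project(private_value, equity_loss, b):
    """The CEO adopts when his private benefit exceeds his share b of the equity loss."""
    return private_value > b * equity_loss


PRIVATE_VALUE = 100_000      # what the project is worth to the CEO
EQUITY_LOSS = 10_000_000     # reduction in the value of the firm's equity

threshold = breakeven_sensitivity(PRIVATE_VALUE, EQUITY_LOSS)

# A sensitivity of $3.25 per $1,000 corresponds to b = .00325.
low_b_adopts = adopts_pet_project(PRIVATE_VALUE, EQUITY_LOSS, 0.00325)
high_b_adopts = adopts_pet_project(PRIVATE_VALUE, EQUITY_LOSS, 0.02)
print(threshold, low_b_adopts, high_b_adopts)
```

With b = .00325 the CEO bears only $32,500 of the $10 million loss, so under this rule the project would go ahead; with b = .02 it would not.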


Incentives Generated by Cash Compensation 

The pay-performance sensitivity is estimated by following all 2,213 
CEOs listed in the Executive Compensation Surveys published in 
Forbes from 1974 to 1986. These surveys include executives serving in 
1,295 corporations, for a total of 10,400 CEO-years of data. We match 
these compensation data to fiscal year corporate performance data 
obtained from the data files of the Compustat and the Center for 
Research in Security Prices (CRSP). After observations with missing 
data are eliminated, the final sample contains 7,750 yearly “first differences”
in compensation and includes 1,688 executives from 1,049
corporations. Fiscal year stock returns are unavailable for 219 of the 
7,750 observations; calendar-year returns are used in these cases. 
(Deleting these 219 observations does not affect the results.) All mon¬ 
etary variables are adjusted for inflation (using the consumer price 
index for the closing month of the fiscal year) and represent 
thousands of 1986 constant dollars. 

Table 1 summarizes estimates of the relation between CEO cash 
compensation and firm performance as measured by the change in 
shareholder wealth. Column 1 of table 1 reports estimated coeffi¬ 
cients from the following least-squares regression: 

\[ \Delta(\text{CEO salary + bonus})_t = a + b\,\Delta(\text{shareholder wealth})_t. \qquad (1) \]

The change in shareholder wealth variable is defined as \(r_t V_{t-1}\), where
\(r_t\) is the inflation-adjusted rate of return on common stock realized in
fiscal year t, and \(V_{t-1}\) is the firm value at the end of the previous year.

Our measure of firm performance is subject to two qualifications.
First, performance should be evaluated before compensation expense,
and yet \(r_t V_{t-1}\) is the change in firm value after compensation expense;
the associated bias in our estimates is small, however, because CEO
pay changes are tiny relative to changes in firm value. Second, our
measure of performance ignores payments to capital. When capital is
an important input, a better performance measure is \(r_t V_{t-1} - f_t K_{t-1}\),
where \(f_t\) and \(K_{t-1}\) are the risk-free interest rate for period t and the
opportunity cost of the capital stock at the beginning of period t. Since
f and shareholder return r tend to be uncorrelated, this adjustment
will not substantially affect our estimates. Fama and Schwert (1977)
find an R² of .03 between nominal riskless rates and 1-month returns
on a value-weighted portfolio of New York Stock Exchange (NYSE)
firms.

TABLE 1

Estimates of Pay-Performance Sensitivity: Coefficients of Ordinary Least
Squares Regressions of Δ(Salary + Bonus), Δ(Total Pay), and Δ(Pay-related
Wealth) on Current and Lagged Δ(Shareholder Wealth)

Dependent variables (in thousands of 1986 constant dollars):
  (1), (2)  Δ(Salary + Bonus)
  (3)       Δ(Total Pay)†
  (4)       Total Pay + PV[Δ(Salary + Bonus)]‡

Independent Variable                   (1)         (2)         (3)         (4)

Intercept                             31.7        30.8        36.6       918.0
Change in shareholder wealth      .0000135    .0000139    .0000235     .000197
  (thousands of 1986 dollars)        (8.0)       (8.4)       (5.2)       (9.7)
Change in shareholder wealth           ...    .0000080    .0000094     .000103
  in year t - 1                                  (5.5)       (2.4)       (5.8)
R²                                   .0082       .0123       .0041       .0157
Estimated pay-performance         .0000135    .0000219    .0000329     .000300
  sensitivity, b§
F-statistic for b                    64.0*       93.0*       28.5*      117.7*
Sample size                          7,750       7,688       7,688       7,688

Note. The sample is constructed from longitudinal data reported in Forbes on 1,668 CEOs serving in 1,049
firms for the years 1974-86. Δ(shareholder wealth) is defined as the beginning-of-period market value multiplied by
the inflation-adjusted rate of return on common stock. t-statistics are in parentheses.

* Significant at the 0.01 percent level.

† The Forbes definition of total compensation typically includes salary, bonus, value of restricted stock, savings and
thrift plans, and other benefits but does not include the value of stock options granted or the gains from exercising
stock options.

‡ Present value based on the assumption that the CEO receives salary and bonus increment until age 70 at a
discount rate of 3 percent.

§ Estimated b is the sum of the coefficients on the contemporaneous and lagged shareholder wealth change.

The coefficient on the shareholder wealth variable of b = .0000135 
in column 1 is statistically significant (t = 8.0), indicating a positive 
relation between cash compensation and firm performance. The eco¬ 
nomic significance of the estimated coefficient is low, however. The 
coefficients in column 1 imply, for example, that a CEO receives an
average pay increase of $31,700 in years in which shareholders earn a
zero return and receives on average an additional 1.35¢ for each
$1,000 increase in shareholder wealth. These estimates are comparable
with those of Murphy (1985, 1986), Coughlan and Schmidt
(1985), and Gibbons and Murphy (1990), who find a pay-performance
elasticity of approximately .1: salaries and bonuses increase
by about 1 percent for every 10 percent rise in the value of the
firm. Converting this estimate of the pay-performance elasticity to
absolute dollars by multiplying by the median pay to value ratio of
0.057 percent (calculated for the 9,976 CEO-years in the Forbes sample
for 1974-86) yields an estimated coefficient b = .000057, which is
larger than, but consistent with, the estimate in column 1 of table 1.
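
The conversion from an elasticity to a dollar sensitivity rests on the identity that an elasticity is (Δpay/pay)/(Δvalue/value), so Δpay/Δvalue equals the elasticity times the pay to value ratio. A short sketch of that arithmetic, with an illustrative function name that is not from the original:

```python
def elasticity_to_sensitivity(elasticity: float, pay_to_value_ratio: float) -> float:
    """Convert a pay-performance elasticity into a dollar-per-dollar sensitivity b."""
    return elasticity * pay_to_value_ratio


# Elasticity of about .1 and a median pay to value ratio of 0.057 percent.
b_implied = elasticity_to_sensitivity(0.1, 0.057 / 100)
cents_per_1000 = b_implied * 1_000 * 100

print(f"b = {b_implied:.6f}")                        # b = 0.000057
print(f"{cents_per_1000:.1f} cents per $1,000")      # 5.7 cents per $1,000
```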

The median annual standard deviation of shareholder wealth 
changes for firms in our sample is about $200 million, so the average 
pay change associated with a stockholder wealth change two standard 
deviations above or below normal (a gain or loss of $400 million) is 
$5,400. Thus the average pay increase for a CEO whose shareholders 
gain $400 million is $37,100, compared to an average pay increase of 
$26,300 for a CEO whose shareholders lose $400 million. 
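
These dollar figures follow directly from the column 1 coefficients once both sides are expressed in thousands of dollars. A minimal sketch of the calculation (the function and constant names are illustrative):

```python
def predicted_pay_change(intercept: float, b: float, d_wealth: float) -> float:
    """Predicted pay change (thousands of dollars) for a wealth change (thousands)."""
    return intercept + b * d_wealth


INTERCEPT = 31.7       # column 1 of table 1, thousands of 1986 dollars
B = 0.0000135          # column 1 coefficient on the shareholder wealth change
SWING = 400_000.0      # $400 million expressed in thousands of dollars

gain_case = predicted_pay_change(INTERCEPT, B, SWING)
loss_case = predicted_pay_change(INTERCEPT, B, -SWING)

print(f"pay effect of the swing: {B * SWING:.1f}")   # 5.4, i.e., $5,400
print(f"{gain_case:.1f} {loss_case:.1f}")            # 37.1 26.3
```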

Equation (1) assumes that current stock price performance affects 
current compensation, and yet the timing of performance payments 
is often ambiguous. At the simplest level, bonus decisions may be 
made before final fiscal year earnings data are available. In other 
cases boards may know this year’s earnings, but the earnings and 
stock price changes available at the end of the fiscal year may not 
correctly incorporate the effects of managerial actions during the 
year. In addition, bonuses reported in proxy statements sometimes 
represent bonuses paid for performance in the previous year, and the 
proxies do not always clearly specify when the bonus payment year 
differs from the bonus measurement year. 

Column 2 of table 1 reports coefficients from the following regres¬ 
sion, which allows current pay revisions to be based on past as well as 
current performance: 

\[ \Delta(\text{CEO salary + bonus})_t = a + b_1\,\Delta(\text{shareholder wealth})_t + b_2\,\Delta(\text{shareholder wealth})_{t-1}. \qquad (2) \]

The coefficient for year t - 1 is positive and statistically significant,
indicating that last year’s performance does matter in the determination
of this year’s pay revision. The sum of the coefficients, \(b = b_1 + b_2\)
= .0000219, is statistically significant (F = 93.0), suggesting that the
CEO receives a total pay revision of 2.2¢ for each $1,000 change in
shareholder wealth. We cannot tell how much of this effect represents 
a real lag of rewards on performance and how much represents sim¬ 
ple measurement errors caused by lags in reporting. We also estimate 
the relation with three years of lagged shareholder wealth changes 
with little difference from the results reported in column 2 of table 1; 
the coefficients on the contemporaneous and first lagged performance
variables are essentially unchanged and those on the second
and third lags are small in magnitude and statistically insignificant.
We reestimate the regression in column 1 of table 1 using 2- and 3-year
differences; the results are quantitatively unchanged from those
in the table. We also reestimate the regression in column 2 of table 1 
after including year dummy variables and separate intercepts for 
each sample CEO, and the estimated coefficients and their sum are 
virtually identical to those reported in the table. To allow the pay- 
performance sensitivity to vary across CEOs, we also estimate separate 
regressions for each of 717 sample CEOs with five or more observa¬ 
tions. The median estimated 2-year pay-performance relation for the 
sample of individually estimated coefficients is b = .000073, or a
median pay raise of 7.3¢ per $1,000 increase in shareholder wealth.

The regressions in columns 1 and 2 of table 1 are based only on the 
CEO’s salary and bonus, but CEOs receive compensation in many 
additional forms, including deferred compensation, stock options, 
profit-sharing arrangements, stock grants, savings plans, long-term 
performance plans, and other fringe benefits. The Forbes surveys include
data on many of these other components of compensation. The
surveys do not, however, include stock option data prior to 1978, and
after 1978 the surveys report gains from exercising options but do not 
report the value of outstanding options or the value of stock options 
granted during the year. 

Column 3 of table 1 reports the relation between total compensa¬ 
tion and firm performance based on the Forbes total compensation 
data, excluding both stock option grants and the gains from exercis¬ 
ing stock options. The Forbes definition of total compensation varies 
somewhat from year to year but in general includes salary, bonus, 
value of restricted stock, savings and thrift plans, and other benefits. 
The sum of the estimated coefficients of current and lagged change in
shareholder wealth is b = .0000329, indicating that total compensation
changes by 3.3¢ for each $1,000 change in firm value.

The dependent variable in column 3 of table 1 represents the 
change in the current cash flows accruing to the CEO, while the in¬ 
dependent variables represent the discounted present value of the 
change in all future cash flows accruing to the shareholders. A mea¬ 
sure of the change in CEO wealth that is more consistent with the 
measure of the change in shareholder wealth is current compensation 
plus the discounted present value of the permanent component of the 
change in current compensation. Suppose, for example, that CEOs 
receive only a base salary and that firm performance is rewarded by a 
permanent shift in the base salary. Then the appropriate measure of
the change in CEO wealth is salary + PV(Δsalary), where PV(Δsalary)
is the present value of the salary change from next year through the
year in which the CEO leaves the firm.

Measuring the discounted present value of a change in current 
compensation is difficult for several reasons. First, Forbes reports only
the sum of salaries and bonuses, and while it may be appropriate to
include PV(Δsalary) in the measure of Δ(CEO wealth), it is less clear
that PV(Δbonus) should be included since bonuses may be transitory
and not permanent components of income. In addition, assumptions
must be made regarding the number of periods remaining over
which Δsalary will be realized. Even when the firm has a mandatory
retirement age of 65, there is some probability that the CEO will
leave the firm before age 65. At the other extreme, pension benefits
are generally based on average salaries received during some period
shortly before retirement; consequently an increase in salary may
increase pension payments to the CEO long after the CEO leaves the
firm.

The dependent variable in column 4 of table 1 is Δ(CEO wealth),
measured as

\[ \Delta(\text{CEO wealth}) = \text{total pay} + PV[\Delta(\text{salary + bonus})]. \]

The present value of the salary and bonus increment is calculated 
assuming a real interest rate of 3 percent per year. In order to get an 
upper bound on the estimate of the pay-performance sensitivity, we 
assume that all changes in salary and bonus are permanent. We assume
that the CEO receives the increment until age 70. If the CEO is
younger than 70, we take the present value of his wage change until
he reaches 70, but if he is older than 70, we assume that he is in his last
year with the firm.
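
The present-value term can be sketched as a simple annuity. The text fixes the 3 percent real rate and the age-70 horizon but not the timing of payments, so the sketch below assumes (as an illustration only) that the increment is received at the end of each remaining year and that a CEO aged 70 or older has no future years left. The function name and the sample figures are hypothetical.

```python
def pv_of_increment(delta_pay: float, age: int,
                    horizon_age: int = 70, rate: float = 0.03) -> float:
    """Present value of a permanent pay increment received until horizon_age.

    Assumes one payment at the end of each remaining year (an assumption,
    not a convention stated in the text).
    """
    years_remaining = max(horizon_age - age, 0)
    return sum(delta_pay / (1.0 + rate) ** k for k in range(1, years_remaining + 1))


# Hypothetical example: a $10,000 raise (10.0 in thousands) for a 55-year-old CEO.
pv_55 = pv_of_increment(10.0, 55)
# Closed-form annuity value for the same 15 payments, as a cross-check.
annuity_55 = 10.0 * (1.0 - 1.03 ** -15) / 0.03

print(f"{pv_55:.2f} {annuity_55:.2f}")
print(pv_of_increment(10.0, 72))   # 0: CEO assumed to be in his last year
```

Shifting the payments to the start of each year would scale the result by 1.03 and would not change the qualitative conclusions.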

The coefficients in column 4 imply that, on average, CEO wealth
increases by $918,000 in years in which shareholders earn a zero
return (the average CEO total pay excluding stock options for the
sample is $575,000). In addition, the estimate for b in column 4 implies
that the CEO’s pay-related wealth (exclusive of stock options)
increases by 30¢ for each $1,000 increase in shareholder wealth. Thus
the average pay-related wealth increase for a CEO whose shareholders
gain $400 million is $1.04 million, compared to an average annual
wealth increase of $800,000 for a CEO whose shareholders lose $400
million.

Incentives Generated by Stock Options 

The Forbes definition of total pay excludes stock options, but stock
options clearly provide value-increasing incentives for chief executives.
Year-to-year stock option grants provide incentives if the size of
the grant is based on performance. More important, the change in
value of unexercised stock options granted in previous years also
provides incentives.
To calculate a more complete measure of the CEO’s wealth change,
which includes options, we analyzed the proxy statements from Mur¬ 
phy's (1985) sample of 73 Fortune 500 manufacturing firms during 
the 15-year period 1969-83. Data on stock options, salaries, bonuses, 
deferred compensation, and fringe benefits from these statements are 
used to construct a longitudinal sample of 154 CEOs. Total compen¬ 
sation is defined as the sum of salaries, bonuses, fringe benefits, 
the face value of deferred compensation unadjusted for the cost of 
restrictions on marketability and the time value of money, and re¬ 
stricted stock awarded during the year (valued at the end-of-year 
stock price). 

At the end of each year, CEOs typically hold stock options granted 
in different years at different exercise prices and exercise dates. The 
value of all options held by the CEO is calculated by applying the 
Black-Scholes (1973) valuation formula, which allows for continu¬ 
ously paid dividends (Noreen and Wolfson 1981; Murphy 1985). The 
value of options held at the end of year t is calculated as 

\[ \sum_{\tau=0}^{t} N_\tau \left[ S_t e^{-dT}\,\Phi(Z_\tau) - P_\tau e^{-rT}\,\Phi\!\left(Z_\tau - \sigma\sqrt{T}\right) \right], \]

where \(N_\tau\) is the number of options granted in year \(\tau\) at exercise price
\(P_\tau\), T is the number of months until expiration of these options, r is the
average monthly market yield on 5-year government securities in year
t, d is the dividend yield in year t - 1 defined as ln[1 + (dividends
per share/closing stock price)]/12, \(\sigma\) is the estimated standard deviation
of stock returns over the previous 60-month period, \(S_t\) is the
stock price at the end of fiscal year t, \(Z_\tau = \{\ln(S_t/P_\tau) + [r - d +
(\sigma^2/2)]T\}/\sigma\sqrt{T}\), and \(\Phi(\cdot)\) is the cumulative standard normal distribution
function.
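
The valuation formula above can be sketched in code as follows. This is an illustrative implementation under stated assumptions, not the authors' program: the function names and the sample grant are hypothetical, all rates are monthly to match T measured in months, and the normal distribution function is built from the standard library's `math.erf`.

```python
import math


def norm_cdf(x: float) -> float:
    """Cumulative standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def option_value(s: float, p: float, t_months: float,
                 r: float, d: float, sigma: float) -> float:
    """Value of one option with continuously paid dividends.

    s: stock price; p: exercise price; t_months: months to expiration;
    r, d, sigma: monthly interest rate, dividend yield, and return volatility.
    """
    root_t = math.sqrt(t_months)
    z = (math.log(s / p) + (r - d + sigma ** 2 / 2.0) * t_months) / (sigma * root_t)
    return (s * math.exp(-d * t_months) * norm_cdf(z)
            - p * math.exp(-r * t_months) * norm_cdf(z - sigma * root_t))


def holdings_value(grants, s: float, r: float, d: float, sigma: float) -> float:
    """Sum over grants, each given as (number of options, exercise price, months left)."""
    return sum(n * option_value(s, p, t, r, d, sigma) for n, p, t in grants)


# Hypothetical holdings: two grants valued at a year-end stock price of $40.
grants = [(10_000, 30.0, 24.0), (5_000, 45.0, 48.0)]
print(round(holdings_value(grants, s=40.0, r=0.006, d=0.002, sigma=0.08), 2))
```

A quick sanity check on such a routine is that a deep in-the-money option with negligible volatility is worth approximately the dividend-discounted stock price minus the discounted exercise price.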

The change in the value of options held at the end of each year is 
calculated as the value of the options awarded during the year plus 
the change in the value of all outstanding options during the year plus 
the profits (price minus exercise price) from exercising options dur¬ 
ing the year. Data on actual exercise prices are not available; to get an 
upper bound on this measure, we assume that options are always 
exercised at the highest stock price observed during the year. 

Column 1 of table 2 reports least-squares regression results for the 
73-firm sample in which the dependent variable is the change in the 
value of the CEO’s stock options. The sum of the estimated coefficients
implies that the value of CEO stock options increases an average
of 14.5¢ for each $1,000 increase in shareholder wealth. Therefore,
the incentives generated by stock options are large relative to the
incentives generated by annual changes in cash compensation (3.3¢
per $1,000 from col. 3 of table 1) even though options valued at date
of grant account for a relatively small share of the CEO’s compensation
(8.1 percent for CEOs in the 73-firm sample).

TABLE 2

Estimates of Pay-Performance Sensitivity Including Stockholdings and
Options: Coefficients of Ordinary Least Squares Regressions of
Δ(CEO Wealth) on Δ(Shareholder Wealth) for CEOs
in 73 Manufacturing Firms for 1969-83

Dependent variables (thousands of 1986 constant dollars):
  (1)       Δ(Value of Stock Options)
  (2), (3)  Total Pay + PV[Δ(Salary + Bonus)] + Δ(Value of Stock Options)
  (4), (5)  Δ(Value of Inside Stock)† + Total Pay + PV[Δ(Salary + Bonus)]
            + Δ(Value of Stock Options)

Independent Variable                 (1)       (2)       (3)       (4)       (5)

Intercept                           79.4     815.9     816.1     818.4     892.9
Change in shareholder wealth     .000105   .000176   .000174    .00118   .000198
  ($ thousands)                    (8.6)     (5.2)     (5.0)     (4.4)     (3.7)
Change in shareholder wealth     .000040   .000131   .000130    .00031   .000168
  in year t - 1                    (3.3)     (3.8)     (3.8)     (1.2)     (3.1)
CEO’s fractional ownership ×         ...       ...    .00294       ...     1.020
  change in shareholder wealth                          (.7)             (145.0)
R²                                 .0807     .0376     .0381     .0216     .9610
Estimated pay-performance        .000145   .000307  .000309‡    .00149    .0020‡
  sensitivity, b
F-statistic for b                  58.3*     33.0*     33.2*     12.5*    565.2*

Note. Sample size is 877 for all regressions. Δ(shareholder wealth) is defined as the beginning-of-period market
value multiplied by the inflation-adjusted rate of return on common stock. Δ(value of stock options) includes profits
from exercising options, value of options granted in current year, and the change in the value of previously granted
options based on Black and Scholes (1973). Total pay includes salary, bonus, value of restricted stock, savings and
thrift plans, and other benefits. PV[Δ(salary + bonus)] is based on the assumption that the CEO receives salary and
bonus increment until age 70 at a discount rate of 3 percent. t-statistics are in parentheses.

* Significant at the 0.01 percent level.

† Inside stockholdings include shares held by family members and shares for which the CEO is a trustee or
cotrustee without beneficial ownership. Δ(value of inside stock) is defined as the beginning-of-period value of inside
stock multiplied by the inflation-adjusted rate of return on common stock. Stock ownership data are unavailable for
[number illegible] of the (73 × 15) = 1,095 possible CEO-years.

‡ Estimated b and related test statistic for a CEO with median fractional ownership for the sample, .0016.

Column 2 of table 2 reports regression coefficients for the 73-firm 
sample in which the dependent variable is the change in all pay- 
related wealth, defined as 

\[ \Delta(\text{CEO pay-related wealth}) = \text{total pay} + PV[\Delta(\text{salary + bonus})] + \Delta(\text{value of stock options}). \]

The present value of the salary and bonus increment is again cal¬ 
culated assuming that the CEO receives the salary and bonus incre¬ 
ment until age 70 at a real interest rate of 3 percent per year. The sum 
of the estimated coefficients on the current and lagged shareholder 
wealth change variables of b = .000307 (F = 33.0) implies that
CEO wealth changes by over 30¢ for each $1,000 change in shareholder
wealth.

To check on potential differences between the 73-firm sample and 
the Forbes sample, we reestimated the Forbes regression in column 2 of 
table 1 for the 73 manufacturing firms and obtained b = .0000196 
(compared to .0000219 for the Forbes sample). We also reestimated 
column 2 of table 2 after excluding stock options and obtained b = 
.0000163 (compared to .000300 as reported in table 1 for the Forbes 
sample). 


Incentives Generated by Inside Stock Ownership 


Stock ownership is another way that an executive’s welfare varies 
directly with the performance of his firm, independent of any link 
between compensation and performance. Although the process 
through which CEOs select their equilibrium stockholdings is not well 
understood, the incentives generated by these shareholdings clearly 
add to the incentives generated by the compensation package. Stock 
ownership data for the CEOs in the 73 firms in the manufacturing 
firm sample were obtained from the proxy statements; these execu¬ 
tives held an average of $4.8 million (in 1986 constant dollars) of their 
firm’s common stock in the period 1969-83. When we include shares 
held by family members and shares for which the CEO serves as a 
trustee or cotrustee, the average increases to $8.8 million. Year-to- 
year changes in the value of these holdings often exceed levels of total 
compensation by orders of magnitude (Lewellen 1971; Benston 1985; 
Murphy 1985). 

Column 4 of table 2 reports regression coefficients in which the 
dependent variable is a measure of the change in the CEO’s wealth 
that includes the change in the value of his inside stockholdings.
Changes in the value of inside stockholdings are calculated as the 
value of the shares held at the beginning of the fiscal year multiplied 
by the realized rate of return on common stock. To get an upper 
bound on the estimate, inside stock ownership includes shares held by 
family members and shares for which the CEO is a nonbeneficial 
trustee or cotrustee, as well as shares held directly. 

The sum of the shareholder wealth change coefficients in column 4 
implies that the wealth of CEOs increases (or decreases) by about 
$1.50 whenever shareholder wealth increases (or decreases) by 
$1,000. The difference between the estimated b in columns 2 and 4 
suggests that, on average, inside stock ownership plays an important 
role in providing managerial incentives. 

Our regression specification in column 2 of table 2 assumes that the 
pay-performance relation is the same for all executives, regardless of 
their stockholdings, but it is plausible that b is large and positive for 
executives with negligible stockholdings but small or even negative for 
executives with large holdings since their wealth may be tied “too
closely” to the performance of their firms. We test for this potential
heterogeneity by reestimating the regressions for the 15-year, 73-firm 
sample after including an interaction term, CEO’s fractional own¬ 
ership x A(shareholder wealth), to capture the effects of ownership 
on the sensitivity of pay to performance. 

The dependent variable in the regression in column 3 of table 2 
is the change in all pay-related wealth (including stock options but 
excluding stock ownership). The small and insignificantly positive 
coefficient of the ownership interaction variable (t = 0.7) implies that
the relation between compensation and performance is independent 
of an executive’s stockholdings. The result that the pay-performance 
relation is not affected by stock ownership seems inconsistent with 
theory since optimal compensation contracts that provide incentives 
for managers to create shareholder wealth will not be independent of 
their shareholdings. 

The dependent variable in the regression in column 5 of table 2 is 
the change in CEO wealth, including all forms of compensation plus 
changes in the value of his individual shareholdings. The coefficient 
on the interaction term is highly significant (t = 145.0) and close to
unity, suggesting that the pay-performance sensitivity for a CEO with 
nonnegligible stockholdings is closely approximated by his fractional 
ownership. Since the total pay-performance relation is given by b = 
.000366 + 1.020 x fractional ownership, the sensitivity for a CEO 
who owns no stock is equivalent, on average, to stockholdings of 
0.0366 percent of the firm. The total pay-performance sensitivity for
a CEO with shareholdings of 0.16 percent (the median shareholdings 
for CEOs in the 73-firm sample) is equivalent to b = .0020, or $2.00
per $1,000 change in shareholder wealth.

TABLE 3

CEO Inside Stock Ownership: Summary Statistics and Quintile Boundaries for
Percentage and Value of CEO Stock Ownership for 746 CEOs Listed in
1987 Forbes Executive Compensation Survey, by Firm Size

Columns 1-3: CEO stock ownership as percentage of shares outstanding.
Columns 4-6: Value of CEO stockholdings ($ millions).

                              All    Small    Large      All    Small    Large
                            Firms    Firms    Firms    Firms    Firms    Firms
                              (1)      (2)      (3)      (4)      (5)      (6)

Mean                        2.42%    3.05%    1.79%    $41.0    $19.3    $62.6
Median                        .25      .49      .14      3.5      2.6      4.7
Quintile boundaries:
  Min                         less than .01%             less than $0.1
  20%                         .05      .11      .03       .7       .5      1.2
  40%                         .17      .33      .10      2.5      1.9      3.3
  60%                         .42      .73      .20      5.1      3.6      7.2
  80%                        1.38     1.95      .75     17.4     10.5     22.6
  Max                       83.00    83.00    53.50  2,304.2  1,041.0  2,304.2

Median value of equity ($ millions): all firms, $1,200; small firms, $580;
large firms, $2,590.

Note. Stock ownership includes shares held by family members and also includes options that can be exercised
within 60 days. Small firms have market value below the sample median ($1.2 billion); large firms have market value
exceeding the median.
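
The column 5 relation, b = .000366 + 1.020 × fractional ownership, can be evaluated for any ownership share. The sketch below does so for the two cases discussed in the text; the function name and default arguments are illustrative.

```python
def total_sensitivity(fractional_ownership: float,
                      base: float = 0.000366, slope: float = 1.020) -> float:
    """Total pay-performance sensitivity implied by column 5 of table 2."""
    return base + slope * fractional_ownership


b_no_stock = total_sensitivity(0.0)      # CEO who owns no stock
b_median = total_sensitivity(0.0016)     # median ownership, 0.16 percent

print(f"{b_no_stock * 100:.4f} percent of the firm")        # 0.0366 percent of the firm
print(f"b = {b_median:.4f}")                                # b = 0.0020
print(f"${b_median * 1_000:.2f} per $1,000 of shareholder wealth")   # $2.00 per $1,000 ...
```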

Table 3 summarizes fractional stock ownership data for a much 
larger sample of CEOs. The 746 CEOs included in the 1987 Forbes 
Executive Compensation Survey hold an average of 2.4 percent of 
their firms’ common stock, including shares held by family members 
and options that can be exercised within 60 days. The distribution of 
inside stock ownership is skewed; the median CEO holds only 0.25 
percent of his firm’s stock. Twenty percent of the sample CEOs hold 
less than 0.05 percent of their firms’ stock, and 60 percent hold less 
than 0.42 percent. Small fractional ownership is even more prevalent 
in the largest Forbes firms (ranked according to market value), where 
80 percent of the CEOs hold less than 0.75 percent of their firms’ 
common stock. 

In dollar terms, table 3 shows that CEOs in the Forbes survey firms 
hold an average of over $40 million of their firms’ stock. Once again, 
the distribution is skewed: the median stock ownership is only $3.5 
million (compared to median 1986 total compensation of $700,000). 
The CEOs in large firms, while owning a smaller fraction of their 
firms’ common stock, tend to have a larger dollar investment in their 
firms’ shares. 
TABLE 4

Relation between CEO Turnover and Firm Performance: Estimated Logistic
Models Predicting CEO Turnover Using Current and Lagged Net-of-Market
Shareholder Return for CEOs Grouped According to Age

Coefficient estimates, by age group:
  (1)  Full Sample
  (2)  Less than 50 Years Old
  (3)  Between 50 and 55
  (4)  Between 55 and 60
  (5)  Between 60 and 64
  (6)  64 Years or Older

Independent Variable          (1)      (2)      (3)      (4)      (5)      (6)

Intercept                   -2.08    -3.30    -3.03    -2.66    -1.97    -.442
Current net-of-market      -.6363   -1.921   -.3946   -.5307   -1.216   -.2453
  return                   (-5.1)   (-3.4)   (-1.0)   (-1.8)   (-4.3)   (-1.1)
Lagged net-of-market       -.4181   -.6219   -.0651   -.2913   -.5510   -.5154
  return                   (-3.5)   (-1.3)    (-.?)   (-1.0)   (-2.1)   (-2.3)
Sample size                 9,291    1,345    1,935    2,728    2,171    1,112
Number of CEO turnovers       992       47       87      174      258      426
Significance of model       .0001    .0021    .5683    .1046    .0001    .0298

Note. The sample is constructed from longitudinal data reported in Forbes on 1,896 CEOs serving in 1,092
firms for 1974-86. Net-of-market return is defined as the fiscal year shareholder return minus the value-weighted
return of all NYSE firms. The dependent variable is equal to one if the CEO is serving in his last full fiscal year and
zero otherwise. Asymptotic t-statistics are in parentheses.


Incentives Generated by the Threat of Dismissal 

The threat of management dismissal for poor performance also pro¬ 
vides value-increasing incentives to the extent that managers are 
earning more than their opportunity cost. Recent studies by Coughlan
and Schmidt (1985), Warner, Watts, and Wruck (1988), and Weisbach
(1988) have documented an inverse relation between net-of-market
firm performance and the probability of management
turnover. These results suggest that managers are more likely to leave
their firms after bad years than after good years and therefore are 
disciplined by the threat of termination. 

Table 4 reports coefficients from logistic regressions predicting the 
probability of CEO turnover as a function of firm performance for 
the 13-year sample of 2,213 CEOs listed in the Forbes surveys. We 
estimate the following relation: 


\[ \ln\left[\frac{\text{prob(turnover)}}{1 - \text{prob(turnover)}}\right] = a + b_1(\text{net-of-market return}) + b_2(\text{lagged net-of-market return}). \]

The dependent variable equals one if the CEO is serving in his last 
full fiscal year and equals zero otherwise. The 1988 Forbes survey was 
examined to identify CEOs whose last fiscal year was 1986. The final 
CEO-year for firms leaving the Forbes survey is excluded since we
cannot determine whether or not this is the last year for that CEO. A
total of 582 firms were deleted from the Forbes surveys during the
1974-86 sample period. Of these, 293 are still “going concerns” as of 
1987, 214 were acquired by or merged with another firm (118 of 
these were acquired or merged within 2 years of the Forbes delisting), 
and 35 liquidated, went bankrupt, or went private. Current status 
data are unavailable for 40 of the 582 firms. 

Consistent with the previous studies, column 1 of table 4 shows that 
the probability that a CEO is serving in his last full fiscal year is 
negatively related to current and past firm performance as measured 
by the return realized by shareholders in excess of the value-weighted 
return on the common stock of all NYSE firms. If we convert the 
regression coefficients into estimated dismissal probabilities, the re¬ 
gression in column 1 implies that a CEO in a firm realizing returns 
equal to the market return in each of the past 2 years has a .111
dismissal probability, calculated as \(p = e^x/(1 + e^x)\), where x = -2.08
- .6363(net-of-market return) - .4181(lagged net-of-market return).
The same CEO has a .175 dismissal probability when the firm
earns a -50 percent return relative to the market in each of the two
previous years. Because it is usually impossible to tell whether the 
CEO was fired or simply quit or retired, the term “dismissal probabil¬ 
ity” is used only as shorthand for the more accurate “probability of 
CEO turnover.” 
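
The conversion from logistic coefficients to turnover probabilities can be reproduced with a few lines of code. The sketch below uses the full-sample coefficients quoted in the text; the function name and argument names are illustrative.

```python
import math


def turnover_probability(net_return: float, lagged_net_return: float,
                         a: float = -2.08, b1: float = -0.6363,
                         b2: float = -0.4181) -> float:
    """Predicted probability of CEO turnover from the logistic model."""
    x = a + b1 * net_return + b2 * lagged_net_return
    return math.exp(x) / (1.0 + math.exp(x))


# Returns equal to the market in both years.
print(f"{turnover_probability(0.0, 0.0):.3f}")     # 0.111
# Returns 50 percent below the market in both years.
print(f"{turnover_probability(-0.5, -0.5):.3f}")   # 0.175
```

Two consecutive years of very poor relative performance therefore raise the predicted turnover probability by only about six percentage points.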

The specification in column 1 of table 4 assumes that the relation 
between performance and turnover likelihood is the same for all ex¬ 
ecutives, but Vancil (1987) argues that CEOs are more likely to be 
fired when they are young than when they are closer to normal retire¬ 
ment. Columns 2-6 of table 4 report results from logistic dismissal 
regressions for CEOs grouped according to age: younger than 50, 
between 50 and 55, between 55 and 60, between 60 and 64, and 64 
years or older. The magnitudes of the coefficients are largest for the 
youngest CEOs, confirming Vancil's hypothesis that younger CEOs 
are more likely to be disciplined by turnover. The relation between 
turnover and performance is insignificant for 50-55-year-old CEOs 
and marginally significant for 55-60-year-old CEOs, suggesting that 
managers between the ages of 50 and 60 are unlikely to be dismissed 
subsequent to poor performance. The dismissal-performance rela¬ 
tion is highly significant for CEOs approaching retirement (between 
60 and 64) and marginally significant for CEOs at or past normal 
retirement age. 

The authors of the earlier studies documenting the dismissal- 
performance relation generally interpret their results as being consis¬ 
tent with the hypothesis that management termination decisions are 
designed to align the interests of managers and shareholders. Each 
author stresses, however, that managers are rarely openly fired from 
their positions. Warner et al. (1988), for example, analyzed 272 firms
for the years 1963-78 and found only a single case of an outright 
firing and only 10 cases in which poor performance was cited as one 
of the reasons for the separation. Weisbach (1988) examined 286 
management changes for 1974-83 and found only nine cases in 
which boards mention performance as a reason why the CEO was 
replaced. 

The data suggest that CEOs bear little risk of being dismissed by 
their boards of directors. The CEOs in our sample who leave their 
firms during the 13-year sample period hold their jobs an average of 
over 10 years before leaving, and most leave their position only after 
reaching normal retirement age. Of the sample CEOs, 60 percent are 
between 60 and 66 when they leave their firm; 32 percent are aged 64 
or 65. Moreover, CEOs seldom leave in disgrace. Vancil (1987) esti¬ 
mates that 80 percent of exiting (nondeceased) CEOs remain on their 
firms’ board of directors, and 36 percent continue serving on the 
board as chairmen. 

The infrequent termination of poorly performing CEOs does not, 
by itself, imply the absence of incentives since even a low probability 
of getting fired can provide incentives if the penalties associated with 
termination are sufficiently severe. Table 5 presents our estimates of 
the turnover-related penalties for poor performance for four hy¬ 
pothetical CEOs of various ages. Column 1 of table 5 shows the pre¬ 
dicted turnover probability (based on the estimated coefficients in 
table 4) for a CEO in a firm realizing exactly the market return in both 
the current and past fiscal years. Column 3 shows the predicted turnover
probability for a CEO in a firm realizing a -50 percent net-of-market
return in each of the past 2 years. A 46-year-old CEO, for
example, has a .036 turnover probability after 2 years of 0 percent 
net-of-market returns but has a .116 turnover probability after 2 
years in which his firm earns 50 percent below market. 

Columns 2 and 4 of table 5 report the expected wealth losses associated
with dismissals for CEOs in firms realizing 0 percent and -50
percent net-of-market returns, respectively, in each of the two pre¬ 
ceding fiscal years. In order to obtain an upper bound on our estimate 
of the turnover wealth loss, we assume that the CEO has no alterna¬ 
tive employment opportunities and that his wealth loss on dismissal is 
the present value (at 3 percent) of $1 million per year starting the year
after dismissal and lasting until the CEO is 66 years old. The expected 
wealth loss is calculated as this present value multiplied by the dis¬ 
missal probabilities calculated from table 4 and reported in columns 1 
and 3 of table 5. Column 5 reports the difference in the dismissal- 
related wealth loss associated with average performance (0 percent) 
and dismal performance (-50 percent), and column 6 compares the 
TABLE 5

Pay-Performance Sensitivity from CEO Dismissals: Implied Turnover
Probabilities and Upper-Bound Expected Wealth Losses from
Turnover for 46-, 53-, 58-, and 62-Year-Old CEOs

Columns 1 and 2: CEOs in firms earning 0% returns relative to the market
in each of the two previous years.
Columns 3 and 4: CEOs in firms earning -50% returns relative to the market
in each of the two previous years.
Column 5: Difference in expected wealth loss from turnover.
Column 6: Estimated pay-performance sensitivity for CEO dismissal with
-50% net-of-market return in two previous years.§

CEO    Turnover       Expected       Turnover       Expected       Difference    Sensitivity
Age*   Probability†   Wealth Loss‡   Probability†   Wealth Loss‡
       (1)            (2)            (3)            (4)            (5)           (6)

46     .036           $510,000       .116           $1,665,000     $1,155,000    89.0¢ per $1,000
53     .046           $459,000       .057           $571,000       $112,000      8.6¢ per $1,000
58     .065           $407,000       .095           $595,000       $188,000      14.5¢ per $1,000
62     .122           $346,000       .252           $714,000       $368,000      28.4¢ per $1,000

* Ages 46, 53, 58, and 62 are sample average ages for CEOs less than 50, between 50 and 55, between 55 and 60,
and between 60 and 64, respectively.

† Turnover probabilities for each age are calculated from the associated age group logistic regressions in table 4.

‡ Expected wealth loss is calculated as the turnover probability multiplied by the present value of $1 million per
year beginning next year and lasting until the CEO is 66 years old. All amounts are in 1986 constant dollars, and the
real interest rate is assumed to be 3 percent.

§ Based on $1.3 billion shareholder loss, which is the shareholder loss on an average-size ($1.73 billion) firm
realizing -50 percent returns in two consecutive years.

CEO’s dismissal-related wealth loss with the wealth loss of sharehold¬ 
ers of an average-size firm ($1.73 billion in our sample), realizing 
a sequence of two net-of-market returns of -50 percent (i.e., a 
2-year cumulative return of -75 percent).
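
The upper-bound wealth losses in columns 2, 4, and 5 of table 5 follow from a present-value calculation that can be sketched in a few lines of Python. The sketch below is illustrative, not the original computation: it assumes that "lasting until the CEO is 66" means the last forgone payment is at age 65, and it uses the turnover probabilities as printed (rounded to three decimals), so its results match the table only up to rounding.

```python
RATE = 0.03              # real discount rate stated in the text
ANNUAL_PAY = 1_000_000   # forgone pay per year, in dollars
RETIREMENT_AGE = 66


def present_value_of_lost_pay(age: int) -> float:
    """PV of $1 million per year, first payment next year.

    Assumption: payments run through age 65, i.e. 66 - age - 1 payments.
    """
    years = RETIREMENT_AGE - age - 1
    return sum(ANNUAL_PAY / (1 + RATE) ** t for t in range(1, years + 1))


# Turnover probabilities from columns 1 and 3 of table 5 (as printed).
PROBABILITIES = {46: (0.036, 0.116), 53: (0.046, 0.057),
                 58: (0.065, 0.095), 62: (0.122, 0.252)}


def expected_losses(age: int) -> tuple[float, float, float]:
    """Return (loss at 0% returns, loss at -50% returns, difference) in dollars."""
    pv = present_value_of_lost_pay(age)
    p_average, p_poor = PROBABILITIES[age]
    return p_average * pv, p_poor * pv, (p_poor - p_average) * pv


for age in PROBABILITIES:
    low, high, difference = expected_losses(age)
    print(age, round(low, -3), round(high, -3), round(difference, -3))
```

Because the printed probabilities are rounded, the computed losses differ from the table entries by roughly 1 percent or less in columns 2 and 4.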

Table 5 predicts, for example, that the expected turnover-related 
wealth loss for a 62-year-old CEO in a firm realizing a 0 percent net- 
of-market return is $346,000, compared to an expected loss of 
$714,000 if his firm earns 50 percent below market in each of the
two previous years. Although the difference in the expected wealth 
loss associated with dismal performance (compared to average per¬ 
formance) of $368,000 seems large, it is small compared to the CEO’s 
losses on his own stockholdings and trivial compared to shareholder 
losses. The CEOs in the 1987 Forbes survey between 60 and 64 years 
old hold a median of $3.2 million worth of stock, and therefore the 
stock market losses on a -75 percent return for a median CEO are 
$2.4 million. Moreover, shareholders lose an average of almost $1.3 
billion on a -75 percent return; the CEOs’ expected dismissal-related
losses of $368,000 imply that CEOs lose 28.4¢ for each $1,000 lost by
shareholders. 
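
The conversion from dollar losses to "cents per $1,000" can be made explicit. The following Python sketch is a minimal illustration using only figures quoted in the text and in table 5; the helper name `cents_per_thousand` is hypothetical.

```python
def cents_per_thousand(ceo_loss: float, shareholder_loss: float) -> float:
    """CEO loss, in cents, for each $1,000 lost by shareholders."""
    return ceo_loss / shareholder_loss * 1_000 * 100


FIRM_VALUE = 1.73e9                      # average-size firm in the sample
CUMULATIVE_RETURN = (1 - 0.5) ** 2 - 1   # two consecutive -50% years = -75%
shareholder_loss = -CUMULATIVE_RETURN * FIRM_VALUE   # about $1.3 billion

# Differences in expected wealth loss from column 5 of table 5, by CEO age.
differences = {46: 1_155_000, 53: 112_000, 58: 188_000, 62: 368_000}
sensitivities = {age: round(cents_per_thousand(loss, shareholder_loss), 1)
                 for age, loss in differences.items()}
print(sensitivities)

# Stock market loss for a CEO holding the median $3.2 million of stock.
stock_loss = 3.2e6 * -CUMULATIVE_RETURN
print(stock_loss)
```

The computed sensitivities agree with the column 6 values cited in the text (89.0¢, 8.6¢, 14.5¢, and 28.4¢ per $1,000).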

Column 6 of table 5 shows that our upper-bound estimate of the 
CEO’s dismissal-performance sensitivity for an average-size firm with 
a -75 percent 2-year return is 8.6¢ and 14.5¢ for a 53- and 58-year-old
CEO, respectively. We find a much larger dismissal-performance
sensitivity for a 46-year-old CEO (89.0¢ per $1,000), but this result
is driven by our inappropriate assumption that the CEO will never
work again if dismissed but will work for his firm until age 66 if not
dismissed. The dismissal-performance sensitivity for the 46-year-old
CEO falls to 44.5¢ per $1,000 if he accepts employment at half his
current pay. 

Our estimates of the dismissal-performance sensitivity in column 6 
represent an upper bound for several reasons. First, we have assumed 
that CEOs leave the labor market after turnover; this assumption may 
be appropriate for older CEOs but is clearly inappropriate for very 
young CEOs. Second, table 5 is based on extraordinarily poor performance
(2 years at -50 percent per year), and the estimated dismissal-performance
sensitivity increases with shareholder losses. For
example, the difference in expected wealth loss for a 62-year-old 
CEO earning 10 percent less than the market in two consecutive years 
(compared to 0 percent net-of-market returns) is $58,000, or about 
18¢ per $1,000 (based on cumulative shareholder losses of 19 percent
or $330 million for an average-size firm), compared to 28¢ per $1,000
for the $1.3 billion loss in column 6 of table 5. Finally, most CEOs are 
covered by employment contracts, severance agreements, or golden 
parachute arrangements that further reduce or eliminate the 
pecuniary punishment for failure; and pensions, outstanding stock 
options, and restricted stock typically become fully vested on an in¬ 
voluntary separation. 

The dismissal-performance sensitivities in column 6 of table 5 can 
be added to the 30¢ per $1,000 pay-performance sensitivity in column 2
of table 1 and the 15¢ per $1,000 pay-performance sensitivity
for outstanding stock options in column 2 of table 2 to construct an
estimate of the total pay-performance sensitivity under direct control
of the board of directors. With an average dismissal-performance
sensitivity (weighted by the number of observations in each age
group) of 30¢ per $1,000, our estimate of the total pay-performance
sensitivity, including both pay and dismissal, is about 75¢ per
$1,000 (b = .00075). Stock ownership adds another $2.50 per $1,000
for a CEO with median holdings, for a total sensitivity of $3.25 per
$1,000 (b = .00325) change in shareholder wealth.
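
The aggregation in this paragraph is simple addition followed by a change of units. As a hedged illustration, the Python sketch below adds the three board-controlled components and the stock-ownership component and converts the totals to the slope b; the dictionary labels and the helper `to_slope` are illustrative names only.

```python
# Components in cents per $1,000 change in shareholder wealth, as quoted in the text.
components_cents = {
    "pay (table 1, col. 2)": 30,
    "outstanding stock options (table 2, col. 2)": 15,
    "dismissal (weighted average of table 5, col. 6)": 30,
}
STOCK_OWNERSHIP_CENTS = 250   # $2.50 per $1,000 for a CEO with median holdings


def to_slope(cents_per_thousand: float) -> float:
    """Convert cents per $1,000 into b, the dollar change per dollar of wealth."""
    return cents_per_thousand / 100 / 1_000


board_controlled_cents = sum(components_cents.values())
total_cents = board_controlled_cents + STOCK_OWNERSHIP_CENTS

print(board_controlled_cents, to_slope(board_controlled_cents))
print(total_cents, to_slope(total_cents))
```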


II. Is the Small Pay-Performance Sensitivity 
Consistent with Agency Theory? 

Agency theory predicts that compensation policy will tie the agent’s 
expected utility to the principal's objective. The objective of share¬ 
holders is to maximize wealth; therefore, agency theory predicts that 
CEO compensation policies will depend on changes in shareholder 
wealth. The empirical evidence presented in Section I is consistent
with this broad implication: changes in both the CEO’s pay-related 
wealth and the value of his stockholdings are positively and statisti¬ 
cally significantly related to changes in shareholder wealth, and CEO 
turnover probabilities are negatively and significantly related to
changes in shareholder wealth. 

Although the estimated pay-performance sensitivity (with respect 
to compensation, dismissal, and stock ownership) is statistically signifi¬ 
cant, the magnitude seems small in terms of the implied incentives. 
Consider again our example of the CEO contemplating a pet project 
that reduces the value of the firm by $10 million. A risk-neutral CEO 
with median holdings (b = .00325) will adopt the project if his private
value exceeds $32,500, while a CEO with no stock ownership (b =
.00075) will adopt the project if his private value exceeds $7,500. For
comparison, the median weekly income of our sample CEOs is approx¬ 
imately $9,400. 
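
The thresholds in this example are the CEO's share b of the value destroyed. A minimal Python sketch of that arithmetic follows; the function name is hypothetical, and expressing the thresholds in weeks of median income is an added comparison based on the $9,400 figure in the text.

```python
def adoption_threshold(b: float, value_destroyed: float) -> float:
    """Private benefit above which a risk-neutral CEO adopts the project.

    The CEO bears only the fraction b of the loss in firm value.
    """
    return b * value_destroyed


PROJECT_COST = 10_000_000       # reduction in firm value from the pet project
MEDIAN_WEEKLY_INCOME = 9_400    # median weekly income of the sample CEOs

with_stock = adoption_threshold(0.00325, PROJECT_COST)
without_stock = adoption_threshold(0.00075, PROJECT_COST)

print(with_stock, without_stock)
print(with_stock / MEDIAN_WEEKLY_INCOME, without_stock / MEDIAN_WEEKLY_INCOME)
```

Under these numbers the threshold for a CEO with median holdings is about three and a half weeks of income, and the threshold for a CEO with no stock is less than one week.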

The purpose of this section is to explore whether our results are 
consistent with formal agency models of optimal contracting. Our 
task is made difficult by the fact that the theory offers few sharp 
predictions regarding the form of the contract other than predicting 
that wages generally increase with observed output. The formal mod¬ 
els do yield clear predictions regarding the pay-performance sensitiv¬ 
ity when the CEO is risk neutral. Given the impossibility of isolating 
the CEO’s marginal contribution to firm value, a risk-neutral CEO has 
incentives to pursue appropriate activities only when he receives 100 
percent of the marginal profits, or b = 1. The optimal contract, in 
effect, sells the firm to the CEO: he receives the entire output as 
compensation but pays the shareholders an up-front fee so that the 
CEO’s expected utility just equals his reservation utility. Jensen and 
Murphy (1988) show that the b = 1 contract that provides optimal 
incentives is also the contract that causes managers to optimally sort 
themselves among firms. 

Chief executive officers are not risk neutral; indeed, the major 
reason for the existence of the publicly held corporation is its ability to 
achieve efficiencies in risk bearing. By creating alienable common 
stock equity claims that can be placed in well-diversified portfolios of 
widely diffused investors, risk-bearing costs are reduced to a fraction 
of those borne by owner-managers of privately held organizations. 
Thus setting b = 1 in a risky venture subjects risk-averse executives to 
large risks, and setting b < 1 to transfer risk from executives to shareholders
generates costs from poor executive incentives. Optimal compensation
contracts must reflect the trade-off between the goals of
providing efficient risk sharing and providing the CEO with incen¬ 
tives to take appropriate actions. 


Executives Are Risk Averse

It is tempting to attribute the generally low pay-performance sensitiv¬ 
ity to CEO risk aversion, but the amount of income “at risk” for poor 
performance is a trivial percentage of the CEO’s total income. The 
total compensation pay-performance sensitivity of b = .0000329 in 
column 3 of table 1 implies, for example, that the pay revision associ¬ 
ated with a wealth change two standard deviations below normal (a 
shareholder loss of $400 million) is about $13,000. The median total 
compensation for CEOs in our sample is $490,000; therefore the 
amount of compensation “at risk” for a $400 million corporate loss is 
only 2.7 percent of the CEO’s total pay. 
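
The "at risk" share can be reproduced from the three numbers in this paragraph. The Python sketch below is illustrative only; the constant names are not from the original text.

```python
B_TOTAL_COMPENSATION = 0.0000329     # column 3 of table 1
SHAREHOLDER_LOSS = 400e6             # two standard deviations below normal
MEDIAN_TOTAL_COMPENSATION = 490_000  # median total compensation in the sample

# Pay revision implied by the estimated slope.
pay_revision = B_TOTAL_COMPENSATION * SHAREHOLDER_LOSS
# Fraction of total pay that this revision represents.
share_at_risk = pay_revision / MEDIAN_TOTAL_COMPENSATION

print(round(pay_revision), round(100 * share_at_risk, 1))  # 13160 2.7
```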

It is more difficult to compare the amount of the CEO’s wealth at 
risk to his total wealth since we cannot calculate the CEO’s total 
wealth. Column 5 of table 2 implies, however, that a CEO’s wealth 
increases an average of $893,000 in years in which both the CEO and 
his shareholders earn a zero return on their shareholdings. In years 
in which shareholders lose $400 million, however, the wealth of a 
nonstockholding CEO increases by about $746,000, while the wealth 
of a large-firm CEO with median inside stockholdings increases by 
only $93,000.¹ In addition, the expected wealth loss associated with
dismissal is approximately 30¢ per $1,000, or $120,000. Therefore,
although the wealth effects of dramatically poor performance are 
substantial, they are not large relative to the normal $893,000 annual 
change in the CEO’s wealth, which is independent of performance. 


High Pay-Performance Contracts Are Not Feasible 

Highly sensitive pay-performance contracts may not be feasible even 
under risk neutrality since executives with limited resources cannot 
credibly commit to pay firms for large negative realizations of corpo¬ 
rate performance, and shareholders cannot credibly commit to huge 
bonuses that amount to “giving away the firm” for large positive real¬ 
izations. The numerical examples above, however, suggest that it 
would certainly be feasible to write binding contracts with a much 
larger share of income or wealth at risk. 

Moreover, successful entrepreneurs regularly sell off large equity
claims, thereby lowering b; avoiding such sales to maintain a high b is a
feasible contracting strategy. Management buy-outs (MBOs), in which
top managers take the firm private by borrowing large sums to repurchase
stock from public shareholders, are a feasible way to undo
previous equity sales and are another way to accomplish high-b contracts.
For example, Kaplan (1989) finds in a sample of 76 MBOs that
the median CEO holdings increase from 1.4 percent to 6.4 percent (b
= .064), and median holdings for the management team as a whole
increase from 5.9 percent to 22.6 percent (b = .226). These high-b
contracts not only are feasible but are growing in importance: MBOs
of public corporations and divisions have increased from $1.2 billion
in 1979 to almost $77 billion in 1987 (Jensen 1989).

¹ This is calculated from col. 5 of table 2 as 893 + .0020 × (-400,000), where .0020
is the estimated pay-performance sensitivity for a CEO owning the 73-firm sample
median of 0.16 percent of his firm's common stock.

Franchising, accounting for 12 percent of gross national product in 
1986, is another feasible way to accomplish high-b contracts (U.S.
Department of Commerce 1987). These contracts are very similar to 
optimal contracts under risk neutrality that, in effect, sell the firm to 
the CEO. The franchisee pays a fixed entry fee for purchase of the 
franchise and receives all profits after payment of an annual fee to the 
franchisor that commonly amounts to between 5 percent and 10 per¬ 
cent of revenues. By granting the franchisee alienable rights in the 
franchise, these contracts resolve most of the horizon problem associ¬ 
ated with motivating managers to make correct trade-offs among cash 
flows through time (Jensen and Meckling 1979). This means that the 
franchisee has a 100 percent claim on the capital value of the fran¬ 
chise on its sale, although the alienability is subject to various restric¬ 
tions such as approval by the franchisor. Thus for these elements of 
changes in value the franchisee contract has b = 1. Franchise contracts
have many other characteristics that reduce the conflicts
of interest between the franchisee and franchisor and thereby reduce
the agency costs that result therefrom (Rubin 1978; Brickley
and Dark 1987), but these issues are beyond the scope of this
paper.


Firm Value Changes Are Imperfect Measures of the 
CEO’s Choice of Actions 

The change in shareholder wealth is the appropriate measure of the 
principal’s objective in the CEO-shareholder agency relationship, but 
it is an imperfect measure of the CEO’s individual performance. 
Holmström (1979) argues that optimal compensation contracts for
risk-averse CEOs should be based not only on the principal’s objective 
(i.e., change in shareholder wealth) but also on any variables that 
provide incremental information valuable in assessing the CEO’s un¬ 
observable choice of action. Examples of potentially informative de¬ 
terminants of incentive compensation include direct measures of 
CEO activity, accounting measures of firm performance, and measures
of “relative performance” based on other executives in the same
industry or market. Unfortunately, the structure of the Holmström
model makes its conclusions irrelevant to most compensation con¬ 
tracts, including those of CEOs. His model assumes that the principal 
knows the utility function of the manager as well as the production 
function relating actions to expected outcomes. For CEOs this means 
that shareholders know with certainty all possible actions of the CEO 
and the distribution of outcomes of each action. In addition, share¬ 
holders must know the set of optimal CEO actions. It is unlikely that 
these conditions are often satisfied. 

More important, Gibbons and Murphy (1989) argue that basing 
compensation on potentially informative additional variables can be 
counterproductive because their use provides incentives for CEOs to 
devote effort to actions that do not increase shareholder wealth, a
phenomenon that is not modeled in Holmström’s analysis. Accounting
profits, for example, may yield information that is valuable in
assessing an executive’s unobservable actions. But paying executives 
on the basis of accounting profits rather than changes in shareholder 
wealth not only generates incentives to directly manipulate the ac¬ 
counting system but also generates incentives to ignore projects with 
large net present values in favor of less valuable projects with larger 
immediate accounting profits. 

Table 6 reports coefficients of regressions of the change in salary 
plus bonus on changes in shareholder wealth, changes in shareholder 
wealth in the industry and market, and two accounting measures of 
performance: changes in accounting profits and changes in sales. We 
focus on the CEO’s compensation and ignore changes in the value of 
his options or stockholdings because these latter components are de¬ 
termined exclusively by firm performance, independent of other vari¬ 
ables such as relative performance and accounting profits. Thus if 
other variables are more important than shareholder wealth changes 
in providing CEO incentives, their importance should show up in a 
strong relation with CEO compensation. 


Relative Performance 

Basing CEO compensation on performance measured relative to 
aggregate performance in the industry or market provides CEOs with 
incentives to increase shareholder wealth while filtering out the risk- 
increasing effects of industrywide and marketwide factors beyond the 
control of executives (Holmström 1982). Column 1 of table 6 reports
coefficients from a regression that includes firm performance mea¬ 
sured relative to the performance of other firms in the same indus¬ 
try as an additional explanatory variable. In particular, the net-of- 
TABLE 6

Pay-Performance Sensitivity of CEO Pay Using Additional Performance
Measures: Coefficients of Ordinary Least Squares Regressions of Δ(Salary +
Bonus) on Various Stock Market and Accounting Measures of Performance

                                          Regression Coefficients†
Independent Variable*          (1)         (2)         (3)         (4)         (5)

Intercept                      31.5        31.9        32.5        31.0        32.8
Δ(shareholder wealth)          .0000140    .0000126    .0000074    .0000120    .0000074
                               (7.5)       (4.8)       (4.5)       (7.1)       (4.4)
Δ(wealth net-of-industry)‡     -.0000012
                               (-.7)
Δ(wealth net-of-market)‡                   .0000013
                                           (.4)
Δ(accounting profits)                                  .000177                 .000187
                                                       (17.2)                  (15.7)
Δ(sales)                                                           .0000122    -.0000034
                                                                   (7.2)       (-1.7)
R²                             .0083       .0082       .0449       .0148       .0453
Sample size                    7,747       7,747       7,721       7,721       7,721

Note. The sample is constructed from longitudinal data reported in Forbes on 1,668 CEOs serving in 1,049
firms, 1974-86. t-statistics are in parentheses.

* The variables are all measured in thousands of 1986 dollars.

† The dependent variable is Δ(salary + bonus), measured in thousands of 1986 constant dollars. The qualitative
results are unchanged when Δ(total pay) is used as the dependent variable.

‡ Δ(wealth net-of-industry) is defined as $(r_t - i_t)V_{t-1}$, where $r_t$ is shareholder return, $V_{t-1}$ is beginning-of-period
market value, and $i_t$ is the value-weighted return for all other firms in the same two-digit industry. Similarly,
Δ(wealth net-of-market) is defined as $(r_t - m_t)V_{t-1}$, where $m_t$ is the value-weighted return for all NYSE stocks.


industry shareholder wealth change variable is defined as $V_{t-1}(r_t - i_t)$,
where $r_t$ and $V_{t-1}$ are the inflation-adjusted shareholder return and
beginning-of-period market value of the sample firm, respectively,
and $i_t$ is the value-weighted inflation-adjusted rate of return in year t
for all other Compustat firms in the same two-digit Standard Indus¬ 
trial Classification industry. Thus the industry variable measures the 
difference between the wealth change shareholders received and 
what they would have received had they invested in other firms in the 
industry instead of investing in the sample firm. Column 2 repeats the 
analysis using wealth changes measured net of market instead of net 
of industry, where the market return is the value-weighted return of 
all NYSE stocks. 
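
The construction of the relative wealth change variable can be illustrated with a short Python sketch. The peer returns and market values below are hypothetical numbers chosen only to show the mechanics; they are not taken from the sample, and the function names are illustrative.

```python
def value_weighted_return(returns, market_values):
    """Value-weighted average return of a benchmark group of firms."""
    total_value = sum(market_values)
    return sum(r * v for r, v in zip(returns, market_values)) / total_value


def net_of_benchmark_wealth_change(v_begin, r_firm, r_benchmark):
    """V_{t-1} * (r_t - i_t): wealth change relative to holding the benchmark."""
    return v_begin * (r_firm - r_benchmark)


# Hypothetical industry peers: returns and beginning-of-period values ($ millions).
peer_returns = [0.02, 0.08]
peer_values = [500.0, 1500.0]
i_t = value_weighted_return(peer_returns, peer_values)

# Hypothetical sample firm worth $1,000 million that earns a 10 percent return.
relative_change = net_of_benchmark_wealth_change(1000.0, 0.10, i_t)
print(i_t, relative_change)
```

In this illustration the benchmark return is 6.5 percent, so shareholders of the sample firm gain about $35 million more than they would have earned in the peer firms. Replacing the peer group with all NYSE stocks gives the net-of-market version.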

The shareholder wealth change coefficients in columns 1 and 2 of
table 6 are positive and significant, indicating that firm performance 
continues to be an important determinant of compensation even after 
net-of-industry and net-of-market performance is controlled for. The 
net-of-industry and net-of-market variables are insignificant; there¬ 
fore it does not appear that relative performance is an important source 
of managerial incentives. While we find that pay changes are unrelated
to relative value changes, $V_{t-1}(r_t - i_t)$, Gibbons and Murphy
(1990) find that pay changes are significantly related to relative rates of
return, $r_t - i_t$.


Accounting Measures of Performance

Column 3 of table 6 reports estimated coefficients from a regression 
of change in CEO salary and bonus on change in net accounting 
income measured before extraordinary items. The estimated coefficient
of .000177 indicates that CEOs receive 17.7¢ for each $1,000
change in annual income. The increased explanatory power (com¬ 
pared to col. 2 of table 1) indicates that changes in accounting income 
are an additional important determinant of pay changes. Since in¬ 
come is a flow rather than a stock, however, the implied pay- 
performance sensitivity for accounting profits is roughly comparable 
to the pay-performance sensitivity for firm value changes of 0.74¢ per
$1,000 in column 3. Suppose, for example, that the market value of
the firm is the capitalized value of future earnings and that earnings
follow a random walk. Then, with a real discount rate of 5 percent,
each $1,000 change in earnings corresponds to a pay change of 17.7¢
and a firm value change of $20,000, or just under a penny per $1,000. 
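
The flow-to-stock conversion in this example can be written out explicitly. The Python sketch below assumes, as the text does, that a change in earnings is permanent (random walk) and is capitalized as a perpetuity at the real discount rate; the names are illustrative.

```python
DISCOUNT_RATE = 0.05
PAY_CENTS_PER_1000_EARNINGS = 17.7   # from the .000177 coefficient


def value_change_from_earnings(earnings_change: float,
                               rate: float = DISCOUNT_RATE) -> float:
    """Capitalized value of a permanent change in annual earnings."""
    return earnings_change / rate


# A $1,000 permanent change in earnings, expressed as a change in firm value.
value_change = value_change_from_earnings(1_000)

# Pay change in cents per $1,000 of firm value change.
cents_per_1000_value = PAY_CENTS_PER_1000_EARNINGS / (value_change / 1_000)
print(value_change, cents_per_1000_value)
```

The result, a little under 0.9¢ per $1,000 of firm value, is the "just under a penny" figure and is of the same order as the 0.74¢ estimate for firm value changes.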

Column 4 of table 6 reports estimated pay-performance coefficients 
from a regression that includes the change in firm sales as an additional
determinant of incentive compensation. The estimated coefficient
of .0000122 suggests that CEOs receive 1.2¢ for every $1,000 of
increased firm revenues, implying a pay revision of $1,900 for each 
standard deviation change in sales (based on the median standard 
deviation for sales changes of $160 million), compared to pay revi¬ 
sions of $2,400 for each standard deviation change in shareholder 
wealth (based on an estimated pay-performance sensitivity of .000012 
and a standard deviation for wealth changes of $200 million). The 
explanatory variables in column 5 include both accounting measures 
of performance—changes in sales and earnings—and also include 
the change in shareholder wealth. The earnings change coefficient 
remains large and positive, indicating that CEOs receive pay raises of 
about 19¢ for each $1,000 change in income. The sales change
coefficient in column 5 is negative and marginally significant, suggest¬ 
ing that, with income and firm value held constant, CEOs receive pay 
cuts of about one-third of a penny for each $1,000 increase in firm 
revenues. Finally, the shareholder wealth change coefficients suggest 
that, with earnings and sales held constant, each $1,000 change in 
shareholder wealth corresponds to a CEO pay change of three- 
fourths of a penny. 
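
The per-standard-deviation pay revisions quoted for column 4 are products of a coefficient and a standard deviation. A brief Python sketch, with an illustrative helper name, reproduces them from the figures in the text.

```python
def pay_revision(coefficient: float, change_in_measure: float) -> float:
    """Dollar change in pay implied by a dollar change in a performance measure."""
    return coefficient * change_in_measure


# One standard deviation of sales changes ($160 million) at .0000122.
sales_revision = pay_revision(0.0000122, 160e6)
# One standard deviation of shareholder wealth changes ($200 million) at .000012.
wealth_revision = pay_revision(0.000012, 200e6)

print(round(sales_revision), round(wealth_revision))  # 1952 2400
```

The sales figure of $1,952 is reported in the text as $1,900.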

The purpose of including additional variables in the regressions in 
table 6 is to analyze whether compensation is highly sensitive to vari¬ 
ables other than the change in shareholder wealth. The results in 
table 6 indicate that CEO compensation is related to changes in accounting
profits and sales but is unrelated to market and industry
performance. While CEO pay appears to be about equally sensitive to 
accounting profits and shareholder wealth, the estimated magnitude 
of both effects is still small: the amount of CEO pay “at risk” for a $48 
million change in accounting profits (which is twice the median stan¬ 
dard deviation) is $9,000, or less than 2 percent of compensation for a 
CEO with median earnings of $490,000. 


Unobservable Measures of Performance 

The small relation between CEO pay and measures of market or 
accounting performance seems inconsistent with the fact that CEOs 
receive a large share of their total compensation in the form of ex¬ 
plicit incentive bonuses. The Conference Board (1984) reports that 
over 90 percent of all large manufacturing firms had bonus plans in 
1983, and 87 percent of firms with bonus plans paid bonuses for 1983 
performance. The median bonus award for CEOs in the Conference 
Board’s survey is 50 percent of base salary; over 20 percent of the
surveyed firms report CEO bonuses exceeding 70 percent of salary. 

It is possible that CEO bonuses are strongly tied to an unexamined 
or unobservable measure of performance. If bonuses depend on per¬ 
formance measures observable only to the board of directors and are 
highly variable, they could provide significant incentives. One way to 
detect the existence of such “phantom” performance measures is to 
examine the magnitude of year-to-year fluctuations in CEO compen¬ 
sation. Large swings in CEO pay from year to year are consistent with 
the existence of an overlooked but important performance measure; 
small annual changes in CEO pay suggest that it is essentially unre¬ 
lated to all relevant performance measures. To test for the existence 
of such unobserved but important pay-performance sensitivity, we 
compare the variability of CEO pay to that of a sample of randomly 
selected workers. 

The data indicate that year-to-year fluctuations in CEO income are 
not much different from income fluctuations for conventional labor 
groups. Column 1 in table 7 presents the frequency distribution of 
inflation-adjusted annual percentage changes in CEO salary plus 
bonus for all CEOs listed in the Forbes surveys from 1974 to 1986. A 
third of the sample observations correspond to inflation-adjusted pay 
changes between 0 percent and 10 percent, and three-fourths of the 
observations reflect pay changes between -10 percent and 25 percent.
Raises in salaries and bonus exceeding 50 percent account for
only 4.4 percent of the sample, and pay cuts of more than 25 percent 
account for only 3.2 percent of the sample. Column 2 in table 7 
TABLE 7

Comparison of Pay Variability of CEOs and Randomly Selected Workers:
Frequency Distribution of Annual Percentage Changes in Real Salary and
Bonus and Total Pay for CEOs Listed in Forbes Compensation Surveys,
1974-86, and Changes in Real Wages for Workers in the
1975-80 Michigan PSID

                        CEOs in Forbes Surveys,      Workers in
                               1974-86               Michigan PSID
Inflation-Adjusted      Salary                       Sample,
Annual Percentage       + Bonus      Total Pay*      1975-80†
Change                  (1)          (2)             (3)

More than 50%           4.4          6.3             4.6
25% to 50%              9.4          10.5            6.8
10% to 25%              21.1         21.3            14.0
0% to 10%               32.3         29.1            34.0
-10% to 0%              21.9         18.9            28.6
-25% to -10%            7.7          8.9             7.8
Less than -25%          3.2          5.0             4.2

Sample size             8,027        8,027           10,247
Standard deviation      30.5         49.3            41.7

* Total pay typically includes salary, bonus, value of restricted stock, savings and thrift plans, and other benefits
but does not include the value of stock options granted or the gains from exercising stock options.

† The wage change distributions for the PSID were made available to us by Ken McLaughlin and include 10,247
male workers aged 18-59 reporting wages earned in consecutive periods.


summarizes the frequency distribution of the inflation-adjusted total 
pay (excluding stock options). Changes in CEO compensation exceed¬ 
ing ±25 percent account for only 21.8 percent of the sample obser¬ 
vations. 

Column 3 of table 7 presents the frequency distribution of annual 
inflation-adjusted percentage wage changes for managerial and
nonmanagerial workers in the Michigan Panel Study of Income Dynamics
(PSID). These distributions were made available to us by Ken 
McLaughlin, who reports similar distributions for logarithmic wage 
changes (McLaughlin 1987). The subset of the PSID sample analyzed 
by him covers the years 1975-80 and includes 10,247 annual wage 
changes for male workers aged 18-59. The wage change distributions 
for the random sample in column 3 are remarkably similar to the 
wage change distribution for CEOs in columns 1 and 2. The standard 
deviation of percentage wage changes for the PSID sample is 41.7, 
compared to 30.5 and 49.3 for CEO salary plus bonus and CEO total 
compensation, respectively. There are a few minor differences that 
are interesting. Executives are less likely to receive real pay cuts than 
workers selected at random; CEOs receive cuts in both salary plus 
bonus and total pay 32.8 percent of the time, while the workers in the 
PSID sample received pay cuts 40.6 percent of the time. Executives 
are more likely to receive raises exceeding 10 percent than random
workers, 34.8 percent and 38 percent for salary plus bonus and total
pay, respectively, for CEOs compared to 25.4 percent for all workers. 
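
The shares cited in this comparison are sums over the bins of table 7. The Python sketch below recomputes them from the table entries; the dictionary keys and the `summarize` helper are illustrative names. Because the table entries are rounded to one decimal, the sums for raises exceeding 10 percent come out at 34.9 and 38.1 rather than the 34.8 and 38 reported in the text.

```python
# Percentage of observations in each pay-change bin, copied from table 7.
# Bin order: >50%, 25 to 50%, 10 to 25%, 0 to 10%, -10 to 0%, -25 to -10%, < -25%.
TABLE_7 = {
    "CEO salary + bonus": [4.4, 9.4, 21.1, 32.3, 21.9, 7.7, 3.2],
    "CEO total pay": [6.3, 10.5, 21.3, 29.1, 18.9, 8.9, 5.0],
    "PSID workers": [4.6, 6.8, 14.0, 34.0, 28.6, 7.8, 4.2],
}


def summarize(shares):
    """Shares of real pay cuts, raises over 10%, and changes beyond +/-25%."""
    return {
        "pay cuts": round(sum(shares[4:]), 1),
        "raises over 10%": round(sum(shares[:3]), 1),
        "beyond +/-25%": round(shares[0] + shares[1] + shares[6], 1),
    }


summary = {group: summarize(shares) for group, shares in TABLE_7.items()}
for group, stats in summary.items():
    print(group, stats)
```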

Corporate management is an occupation in which, a priori, we 
would expect incentive compensation to be especially important. It is 
therefore surprising that the distribution of wage changes for CEOs is 
so similar to the distribution for randomly selected workers. It ap¬ 
pears that annual executive bonuses are not highly variable. These 
data seem inconsistent with economic theories of compensation: in 
spite of the fact that bonuses nominally amount to 50 percent of 
salary, there seem to be too few major year-to-year percentage 
changes in CEO compensation to provide the incentives that are likely 
to make a substantial difference in executive behavior. 


Direct Measures of Performance 

Incentive contracts are unnecessary when CEO activities are perfectly 
observable and when shareholders (or boards of directors) can tell the 
CEO precisely which actions to take in each state of the world. When 
their activities are imperfectly observable, CEOs will be evaluated in
part by observing output (change in shareholder wealth) and in part 
by observing input (CEO activities). One explanation for the small 
pay-performance sensitivity is that boards have fairly good informa¬ 
tion regarding managerial activity, and therefore the weight on out¬ 
put is small relative to the weight on input. 

The hypothesis that corporate boards directly monitor managerial 
input is consistent with the data but inconsistent with generally held 
beliefs in the business and financial community. Outside members of 
corporate boards have only limited contact with the CEO—at most 1 
or 2 days a month—and the meetings that do occur are typically held 
in the CEO’s office with agendas and information controlled by him. 
More important, the hypothesis that “forcing contracts” can be writ¬ 
ten when managerial actions are observable hinges crucially on the 
assumption that shareholders or boards know what actions should be 
taken. Managers often have better information than shareholders 
and boards in identifying investment opportunities and assessing the 
profitability of potential projects; indeed, the expectation that man¬ 
agers will make superior investment decisions explains why share¬ 
holders relinquish decision rights over their assets by purchasing 
common stock. Basing compensation on observed managerial actions 
cannot provide CEOs with incentives to engage in value-increasing 
activities when the expected wealth consequences of alternative ac¬ 
tions are unknown to shareholders and board members. Appropriate 
incentives can be generated in these cases, however, by basing com¬ 
pensation on changes in shareholder wealth. 

Nonpecuniary Rewards Provide Adequate Incentives

Our estimates of the pay-performance sensitivity (with respect to 
compensation, stock ownership, and dismissal) include only monetary 
rewards for performance and ignore potentially important non¬ 
pecuniary rewards associated with managing a firm. These nonpecu¬ 
niary rewards could provide incentives for CEOs to take appropriate 
actions even when direct monetary incentives are absent. 

Nonmonetary rewards such as power, prestige, and honor will 
definitely affect the level of monetary compensation necessary to at¬ 
tract properly qualified people to the firm, but unless nonmonetary 
rewards vary positively with the value of the firm, they will not increase
the CEO’s incentives to take appropriate actions (except
through the threat of performance-related dismissal). Moreover, be¬ 
cause nonpecuniary benefits tend to be a function of position or rank, 
it is difficult to vary the amount of nonpecuniary benefits received by 
an executive from period to period to correspond with increases or 
decreases in productivity. It is therefore unlikely that nonpecuniary 
factors are an important source of incentives pushing managers to 
maximize value. 

Nonpecuniary rewards associated with success and accomplish¬ 
ment, and nonpecuniary punishments associated with failure, do pro¬ 
vide incentives for managers. However, these nonpecuniary incen¬ 
tives, generally associated with reputation in the firm and standing in 
the community, will motivate managers to act in shareholders’ interest 
only if the nonpecuniary rewards and punishments are directly asso¬ 
ciated with firm value changes. This is a serious problem because 
there are strong political and organizational forces that tend to define
success in dimensions other than shareholder wealth and exert pres¬ 
sures for actions that reduce firm value. Managerial conformance to 
pressures to maintain employment, peace with unions, or major con¬ 
tributions to communities by keeping unprofitable plants open can 
easily become synonymous with “success.” In such situations, the non¬ 
pecuniary rewards come at the expense of shareholder value and 
economic efficiency. 


External Forces Provide Adequate Incentives 

Compensation and termination policy are internal tools utilized by 
boards of directors to provide managerial incentives. There are also 
competitive forces external to the corporation that provide incentives, 
including competition in the product market (Hart 1983), the managerial
labor market (Fama 1980), and the market for corporate control
(Manne 1965). Product market competition disciplines managers
since firms that are inefficiently managed will be unprofitable and will
not survive. Competition in the managerial labor market, especially 
the labor market internal to the organization, includes the incentives 
of subordinates to replace inferior superiors. The threat of takeovers 
also provides incentives since managers are often replaced following a 
successful takeover. Martin and McConnell (1988) report, for ex¬ 
ample, that 61 percent of target firm managers depart within 3 years 
after a successful takeover compared with 21 percent for a non- 
merged control sample, and Walsh (1988) reports that 37 percent of 
the entire top-management team leaves the target firm within 2 years 
of a takeover compared with 13 percent of a nonmerged control 
sample. 

Although these external forces provide incentives for existing man¬ 
agement, we focus on internal incentive mechanisms since these are 
under the direct control of boards of directors. Moreover, external 
forces such as takeovers may be a response to, instead of an efficient 
substitute for, ineffective internal incentives. 

III. Alternative Hypotheses 

The conflict of interest between managers and shareholders is a classi¬ 
cal agency problem, but the small observed pay-performance sensitiv¬ 
ity seems inconsistent with the implications of formal principal-agent 
models. Two alternative hypotheses consistent with the observed rela¬ 
tion between pay and performance are that (1) CEOs are not, in fact, 
important agents of shareholders, and (2) CEO incentives are unim¬ 
portant because their actions depend only on innate ability or compe¬ 
tence. There has not yet been careful empirical documentation of the 
ways in which CEOs affect the performance of their firms, but there is 
considerable evidence that the competence and actions of a CEO are 
important to the productivity of the firm. The fact that stock prices 
react significantly to the death (Johnson et al. 1985) or replacement 
(Warner et al. 1988) of CEOs, for example, is inconsistent with the 
hypothesis that CEOs do not matter. 

The wave of MBOs and the improved productivity they generate 
are consistent with the hypothesis that CEOs and the incentives they 
face are important to firm performance. There is strong evidence that 
the 96 percent average net-of-market increase in value associated with 
these buy-outs is caused by new top-management incentives (Jensen 
1989; Kaplan 1989). The experience with MBOs is inconsistent with 
the hypothesis that managerial incentives are unimportant because in 
these transactions the same top managers manage the same assets 
after the company goes private. Data from takeovers, which are associated
with high management turnover and produce average increases
in firm value of 50 percent, are also consistent with the hypothesis
that top-level managers can have a large effect on firm
performance. 

Another hypothesis that we believe helps reconcile our empirical 
results concerns the important role of third parties in the contracting 
process. Managerial labor contracts are not, in fact, a private matter 
between employers and employees. Strong political forces operate in 
both the private sector (board meetings, annual stockholder meetings, 
and internal corporate processes) and the public sector that affect 
executive pay. Managerial contracts are not private because by law the 
details of the pay package are public information open to public 
scrutiny and criticism. Moreover, authority over compensation deci¬ 
sions rests not with shareholder-employers but rather with compensa¬ 
tion committees composed of outside members of the boards of direc¬ 
tors who are elected by, but are not perfect agents for, shareholders. 
Fueled by the public disclosure of executive pay required by the Secu¬ 
rities and Exchange Commission, parties such as employees, labor 
unions, consumer groups, Congress, and the media create forces in 
the political milieu that constrain the type of contracts written be¬ 
tween management and shareholders. 

The benefits of the public disclosure of top-management compen¬ 
sation are obvious since this disclosure can help provide a safeguard 
against “looting” by management (in collusion with “captive” boards
of directors). The costs of disclosure are less well appreciated. Public 
information on “what the boss makes” affects contracts with other 
employees and provides emotional justification for increased union 
demands in labor negotiations. Media criticism and ridicule and the 
threat of potential legislation motivated by high payoffs to managers 
reduce the effectiveness of executives and boards in managing the 
company. The media are filled with sensational stories about execu¬ 
tive compensation each spring at the height of the proxy season. 
Board members are subject to lawsuits if top-management pay is “too 
high” relative to pay observed in similar firms (but never if it is “too 
low”). Since the subjective “reasonableness” of a compensation pack¬ 
age is strongly influenced by the political process, it is natural that 
well-intentioned but risk-averse board members will resist innovative 
incentive contracts. 

Strong public antagonism toward large pay changes is illustrated by 
the recent conflict leading to the defeat of congressional pay in¬ 
creases. National polls indicate that 85 percent of voters opposed the 
50 percent increase in congressional salaries (from $89,500 to 
$135,000) even though this increase would have left salaries lower in 
real terms than 1969 levels (Rogers 1989). 

The Implicit Regulation Hypothesis: Evidence from the 1930s

It is difficult to document the influence of the political process on 
compensation since the constraints are implicit rather than explicit 
and the public disclosure of top-management compensation has ex¬ 
isted for half a century. One possible way to test this implicit regula¬ 
tion hypothesis is to compare our pay-performance results for 1974- 
86 to the pay-performance relation when regulatory pressures were 
less evident. We construct a longitudinal sample of executives from 
the 1930s using data collected by the U.S. Work Projects Administra¬ 
tion (WPA) in a 1940 project sponsored by the Securities and Ex¬ 
change Commission (1940-41). The WPA data, covering fiscal years 
1934-38, include salary and bonus paid to the highest-paid executive
in 748 large U.S. corporations in a wide range of manufacturing and 
nonmanufacturing industries. Of the WPA sample firms, 394 are 
listed on the NYSE; market value data for these firms are available on 
the CRSP Monthly Stock Returns Tape. 

Comparing corporate data from the 1934-38 WPA sample to cor¬ 
responding data from the 1974-86 Forbes sample is difficult because 
of reporting differences and because of major secular changes in the 
number of corporations and the size distribution of corporations over 
the past five decades. The “CEO” designation was rarely used in the
1930s, and therefore for comparison purposes we define CEO as 
the highest-paid executive. In addition, the WPA data do not reveal 
the name of the highest-paid executive, and therefore some salary 
and bonus changes reflect management changes rather than pay revi¬ 
sions for a given manager. For comparison purposes, the 1974-86 
pay change data utilized in tables 8 and 9 were constructed ignoring 
management changes. Finally, in order to compare similar firms in 
the two time periods, we restrict our analysis to firms that are in the 
top quartile of firms listed on the NYSE (ranked by market value). 
The WPA compensation data are available for 60 percent of the top- 
quartile NYSE firms for 1934-38 (averaging 114 firms per year), and 
Forbes compensation data are available for 90 percent of the top- 
quartile NYSE firms for 1974-86 (averaging 335 firms per year). 

Table 8 presents sample compensation statistics for CEOs in the top 
quartile of NYSE corporations ranked by market value for 1934-38 
and compares these results to similarly constructed data for 1974-86. 
The CEOs in the largest-quartile firms earned an average of $813,000 
measured in 1986 constant dollars in the 1930s, significantly more
than the average pay of $645,000 earned by CEOs in the NYSE top 
quartile from 1974 to 1986. Over this same time period, median pay 

TABLE 8

CEO Compensation in 1934-38 versus 1974-86: Sample Compensation
Statistics for CEOs in the Top Quartile of NYSE
Corporations Ranked by Market Value

                                                               Test Statistic
Variable (in 1986 Dollars)        1934-38        1974-86       for Difference

CEO salary + bonus:
  Mean                            $813,000       $645,000      t = 9.1
  Median                          $639,000       $607,000
Mean market value of firm         $1.6 billion   $3.4 billion  t = -6.1
Mean CEO salary + bonus as a
  percentage of firm market
  value                           .110%          .034%         t = 29.6
Change in CEO salary + bonus:
  Mean                            $31,900        $27,800       t = .4
  Median                          $200           $21,600
  Average standard deviation*     $205,000       $127,000      t = 2.7

Note.—For the 1934-38 data, CEOs are defined as the highest-paid executive. Sample sizes are 456 and 3,988
CEO-years for the 1934-38 and 1974-86 samples, respectively.

* The standard deviation for Δ(salary + bonus) was calculated for each firm with at least three years of data;
sample sizes are 108 firms and 436 firms for the earlier and later time periods, respectively. The t-statistic tests the
equality of the average standard deviations in the two samples. The samplewide (pooled) standard deviation of pay
changes was $167,500 for 3,928 CEO-years for 1974-86, compared to $463,500 for 448 CEO-years for 1934-38.


fell from $639,000 to $607,000. The current popular belief that CEO 
pay in the largest corporations has increased dramatically over the 
past several decades is therefore not supported by these sample aver¬ 
ages. Over this same time period, there has been a doubling (after 
inflation) of the average market value of a top-quartile firm—from 
$1.6 billion in the 1930s to $3.4 billion for 1974-86. Along with the 
decline in salaries, this means that the ratio of CEO pay to total firm
value has fallen significantly in 50 years—from 0.11 percent in the 
early period to 0.03 percent in the later period. The mean annual 
change in compensation in the earlier period was $31,900 as com¬ 
pared to $27,800 in the 1974-86 period. More important, the vari¬ 
ability of annual changes in CEO pay fell considerably over this 
period; the average standard deviation of the annual pay changes 
was $127,000 in the 1970s and 1980s, significantly lower than the 
$205,000 average in the 1930s. 

The pronounced decline in the raw variability of salary changes 
evident in table 8 suggests the possibility of a decreased sensitivity in 
the pay-performance relation. Table 9 reports estimated coefficients 
from regressions of change in CEO salary and bonus on this year’s 
and last year’s change in shareholder wealth. The 1930s regression 
indicates that each $1,000 increase in shareholder wealth corresponds
to an 11.4¢ increase in this year’s pay and a 6.1¢ increase in next

TABLE 9

CEO Pay-Performance Sensitivity in 1934-38 versus 1974-86: Regressions of
Change in CEO Salary + Bonus on Change in Shareholder Wealth for CEOs
in the Top Quartile of NYSE Corporations Ranked by Market Value

                                             Regression Coefficients*
Independent Variable                         1934-38        1974-86

Intercept                                    6.3            22.3
Δ(shareholder wealth)                        .000114        .000012
  (thousands of 1986 dollars)                (5.6)          (7.0)
Δ(shareholder wealth)                        .000061        .000007
  in year t - 1                              (2.8)          (4.4)
R²                                           .0702          .0165
Estimated pay-performance sensitivity, b     .000175        .000019
Estimated cents per $1,000                   17.5¢          1.9¢

Note.—For the 1934-38 data, CEOs are defined as the highest-paid executive. Sample sizes are 427 and 3,826
CEO-years for the 1934-38 and the 1974-86 samples, respectively; t-statistics are in parentheses.

* Dependent variable is Δ(salary + bonus), measured in thousands of 1986 constant dollars.


year’s pay; thus the total effect of a $1,000 increase in shareholder 
wealth is 17.5¢. In contrast, the regression using the 1974-86 data
implies only a 1.9¢ pay change for each $1,000 change in shareholder
wealth. Thus the pay-performance relation for CEOs in the top quar¬ 
tile of NYSE firms has fallen by a factor of 10 over the past 50 years. 
These results, although not conclusive, are consistent with the implicit 
regulation hypothesis because political constraints and pressures, 
disclosure requirements, and the overall regulation of corporate 
America have increased substantially over the same period.
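
The arithmetic behind the "cents per $1,000" row of table 9 can be sketched as follows. This is only an illustration, not the authors' procedure: the function name `cents_per_thousand` is hypothetical, and the coefficients are copied from table 9. Because both the dependent and the independent variables are measured in thousands of dollars, each coefficient can be read as dollars of pay per dollar of shareholder wealth, and the total sensitivity b is the sum of the current and lagged coefficients.

```python
def cents_per_thousand(coef_current, coef_lagged):
    """Total pay change, in cents, per $1,000 change in shareholder wealth.

    Hypothetical helper: sums the current-year and lagged-year coefficients
    (dollars of pay per dollar of wealth), scales to $1,000, converts to cents.
    """
    b = coef_current + coef_lagged  # estimated pay-performance sensitivity
    return b * 1_000 * 100


early = cents_per_thousand(0.000114, 0.000061)  # 1934-38 coefficients
late = cents_per_thousand(0.000012, 0.000007)   # 1974-86 coefficients

print(f"1934-38: {early:.1f} cents per $1,000")
print(f"1974-86: {late:.1f} cents per $1,000")
print(f"ratio of early to late sensitivity: {early / late:.1f}")
```

The ratio of the two sensitivities is a little over 9, which is the sense in which the text describes a decline "by a factor of 10."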

The incentives generated by CEO stock ownership have also declined
substantially over the past 50 years. Table 10 shows time trends
in the stock ownership of CEOs for two different samples of firms. 
The first sample consists of all CEOs in the 120 largest firms (ranked 
by stock market value) in 1938, 1974, and 1984; we collected stock 
ownership data for these CEOs from proxy statements. Proxy state¬ 
ments for 1938 were available for only 53 of the largest 120 firms in 
1938; stock ownership data for CEOs in 16 additional firms were 
obtained using 1939 and 1940 proxy statements. 

Part A of table 10 shows that CEO percentage of ownership (in¬ 
cluding shares held by family members and trusts) in the largest 120 
firms fell from a median of 0.30 percent in 1938 to 0.05 percent in 
1974 and fell further to 0.03 percent in 1984 (average percentage of
ownership fell from 1.7 percent in 1938 to 1.5 percent and 1.0 per¬ 
cent in 1974 and 1984, respectively). In addition, the median dollar 
value of shares held (in 1986 constant dollars) fell from $2,250,000 in 
1938 to $2,061,000 in 1974 and to $1,801,000 in 1984. The decline in 

TABLE 10

Time Trends in CEO Inside Stock Ownership: Median CEO Stock Ownership
for Two Samples of Firms

                                    Median Value of Stock      Median Percentage
Sample and Year                     Owned (1986 Dollars)       of Firm Owned

A. 120 largest firms ranked by
   market value:
     1938                           $2,250,000                 .30%
     1974                            2,061,000                 .05
     1984                            1,801,000                 .03
B. 73 manufacturing firms:
     1969-73                         3,531,000                 .21
     1974-78                         1,397,000                 .14
     1979-83                         1,178,000                 .11
     15-year sample                  1,697,000                 .16

Note.—Stock ownership obtained from proxy statements includes not only shares held directly but also shares
held by family members or related trusts.


the value of shares held between 1974 and 1984 is especially signifi¬ 
cant since 1974 was a “bust” year in the stock market, while 1984 was a 
“boom” year. The value-weighted portfolio of all NYSE stocks in¬ 
creased by 113.4 percent (after inflation) over this interval, so if the 
median executive had maintained his stockholdings and if these had 
increased by the same percentage as that of the market portfolio, the 
value of his holdings would have increased from $2,061,000 in 1974 
to $4,400,000 in 1984 instead of falling to $1,801,000. 
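
The counterfactual in this paragraph is a one-line calculation, sketched here with figures taken from the text and table 10. The helper name `counterfactual_value` is an assumption introduced for illustration, and the sketch assumes, as the text does, that the holdings earn exactly the real return on the value-weighted NYSE portfolio.

```python
def counterfactual_value(initial_value, real_market_return):
    """Value of unchanged holdings that grow at the real market return.

    real_market_return is a fraction (1.134 means a 113.4 percent increase).
    """
    return initial_value * (1.0 + real_market_return)


held_1974 = 2_061_000    # median holdings in 1974 (1986 dollars), table 10
actual_1984 = 1_801_000  # median holdings in 1984 (1986 dollars), table 10

hypothetical_1984 = counterfactual_value(held_1974, 1.134)
shortfall = hypothetical_1984 - actual_1984

print(f"hypothetical 1984 value: ${hypothetical_1984:,.0f}")
print(f"shortfall relative to actual 1984 median: ${shortfall:,.0f}")
```

Rounded to the nearest $100,000, the hypothetical value matches the $4,400,000 reported in the text.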

Part B of table 10, based on the 73 manufacturing firm sample, 
shows the median value of stock owned by CEOs and their percentage 
of ownership for the full 15-year sample and for 5-year intervals. For 
1969-73, the median CEO in the 73 sample firms held $3,531,000 in 
common stock (1986 dollars), which accounted for 0.21 percent of the 
shares outstanding. By 1979-83, the median ownership had fallen 67 
percent to $1,178,000, accounting for only 0.11 percent of the shares 
outstanding. Over the same time period, the average stock ownership, 
which is strongly influenced by a few CEOs with extraordinarily large 
holdings, fell from $14,100,000 to $8,500,000. 

The political pressures associated with high pay-performance con¬ 
tracts do not appear to extend to gains from stock ownership. We 
therefore expect increases in political pressure to correspond to de¬ 
creases in pay-performance sensitivity and increases in incentives asso¬ 
ciated with stock ownership. The dramatic decline in CEO stock own¬ 
ership over the past 50 years is contrary to the implicit regulation 
hypothesis and suggests a significant downward trend in managerial
incentives that is not explained by existing theories. 

Political Influence and the Effect of Firm Size on the
Pay-Performance Sensitivity 

Political influence is likely to be more pronounced in large firms since 
larger firms tend to be more visible and more closely scrutinized than 
smaller firms (Watts and Zimmerman 1986, chap. 10). The implicit 
regulation hypothesis thus predicts that the pay-performance sen¬ 
sitivity declines with firm size, but our all-inclusive estimate of $3.25 
per $1,000 is based on a constant pay-performance sensitivity across 
firms. Although the Forbes sample analyzed in this paper includes the 
nation’s largest firms, the size distribution of firms within the sample is 
highly skewed. The average and median market values of firms in our 
sample are $1.73 billion and $810 million (1986 dollars), respectively. 
The average and median market values for firms larger than the 
sample median are $3.1 billion and $1.6 billion, respectively, while the
average and median market values for firms smaller than the sample 
median are $400 million and $360 million, respectively. 

We test for the effect of firm size on the pay-performance sensitiv¬ 
ity by reestimating the results in tables 1, 3, 4, 5, and 6 for firms with 
market value in a given year above or below the sample median mar¬ 
ket value for that year. Of the 73 manufacturing firm sample (table 2), 
80 percent fall into the “above-median” category (on the basis of the 
Forbes sample); thus we did not reestimate the results in table 2 by firm 
size. Our overall results are summarized in columns 2 and 3 of table 
11; to save space, details of the estimates are not provided but are 
available on request. We have previously noted substantial differences 
in CEO stockholdings in small and large firms (table 3); table 11 
suggests other interesting differences between the two samples. Row 
1 shows that each $1,000 change in shareholder wealth corresponds
to a 4.1¢ pay raise for CEOs in small firms, but only 2.0¢ for CEOs
in large firms. Also, current and past net-of-market performance 
is a strong predictor of CEO turnover in below-median-size firms, 
but performance and turnover are both economically and statis¬ 
tically insignificantly related for large firms. As reported in row 5, 
the average dismissal-performance sensitivity (weighted by the 
number of observations in each age group) is $2.25 per $1,000 
change in shareholder wealth for CEOs in small firms, but only
5¢ per $1,000 for CEOs in large firms. Our all-inclusive estimated
pay-performance sensitivity (row 8) for small firms is $8.05 per 
$1,000, four times greater than our large-firm estimate of $1.85 per 
$1,000.
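
The all-inclusive estimates are built by adding the component sensitivities reported in table 11. The sketch below reproduces that addition; the dictionary keys and the function name `all_inclusive` are hypothetical labels chosen for illustration, while the dollar figures (per $1,000 change in shareholder wealth) are taken from rows 2, 3, 5, and 7 of the table.

```python
# Component sensitivities in dollars per $1,000 change in shareholder wealth.
# Keys are illustrative labels, not the authors' variable names.
COMPONENTS = {
    "all":   {"cash": 0.30, "options": 0.15, "dismissal": 0.30, "stock": 2.50},
    "large": {"cash": 0.25, "options": 0.15, "dismissal": 0.05, "stock": 1.40},
    "small": {"cash": 0.75, "options": 0.15, "dismissal": 2.25, "stock": 4.90},
}


def all_inclusive(c):
    """Add the components in the order table 11 does (rows 4, 6, and 8)."""
    direct_pay = c["cash"] + c["options"]       # row 4 = row 2 + row 3
    total_pay = direct_pay + c["dismissal"]     # row 6 = row 4 + row 5
    return total_pay + c["stock"]               # row 8 = row 6 + row 7


totals = {group: round(all_inclusive(c), 2) for group, c in COMPONENTS.items()}
size_ratio = totals["small"] / totals["large"]

for group, total in totals.items():
    print(f"{group:>5}: ${total:.2f} per $1,000")
print(f"small-firm estimate is {size_ratio:.1f} times the large-firm estimate")
```

Most of each total comes from the stock-ownership row rather than from pay or dismissal, which is consistent with the emphasis the text places on inside stockholdings.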

Varying degrees of political pressure across firms or decades are 
not of course the only potential explanations for the size effect or 
secular decline in pay-performance sensitivities; thus the evidence 

TABLE 11

Estimated Pay-Performance Sensitivity: Total Effects (over 2 Years) on CEO
Compensation-related Wealth Corresponding to Each $1,000 Change in
Shareholder Wealth for CEOs in Forbes Sample, 1974-86, by Firm Size

                                                      Predicted CEO Wealth Change per
                                                      $1,000 Change in Shareholder Wealth

                                                      All Firms   Large Firms   Small Firms
                                                         (1)          (2)           (3)

1. Change in this year’s and next year’s salary +
   bonus                                               $ .022       $ .020        $ .041
2. Total compensation + present value of the
   change in salary + bonus                              .30          .25           .75
3. Change in the value of stock options                  .15          .15           .15
4. Change in direct pay-related wealth (row 2 +
   row 3)*                                               .45          .40           .90
5. Change in wealth due to dismissal from poor
   performance                                           .30          .05          2.25
6. Change in total pay-related wealth (row 4 +
   row 5)                                                .75          .45          3.15
7. Change in wealth related to stock ownership
   for CEO with median stockholdings                    2.50         1.40          4.90
8. Change in all pay- and stock-related wealth†        $3.25        $1.85         $8.05

Source.—Row 1: table 1, col. 2. Row 2: table 1, col. 4. Row 3: table 2, col. 1, estimated for the 73-firm sample. We
assume that the option-performance sensitivity is the same for both size groups. Row 5: table 5, col. 6. This is the
weighted average of estimates for each age group. Row 7: table 3, cols. 1, 2, and 3. Stock ownership includes shares
held by family members and connected trusts. Ownership also includes options that can be exercised within 60 days;
thus there is some “double counting” in rows 3 and 7.

Note.—Estimates are rounded to the nearest nickel (except for row 1). Large firms have market value in a given
year above the Forbes sample median for that year, while small firms have market value below the median. Details of
the estimates by firm size are not provided in the text but are available on request.

* The direct estimate from the 73 manufacturing firms is only 31¢ (table 2, col. 2); we have reported the larger
estimate as an upper bound.

† Cols. 3 and 5 of table 2 show that fractional stockholdings can be added to other sources of incentives to
construct an overall pay-performance sensitivity.


presented is supportive of the implicit regulation hypothesis but not 
conclusive. For example, higher pay-performance sensitivities for 
smaller firms could reflect that CEOs are more influential in smaller 
companies. A thorough empirical investigation of the implicit reg¬ 
ulation of executive compensation would be useful, but such an in¬ 
vestigation requires detailed data on the compensation practices of 
partnerships, closely held corporations, and other nonpublic organi¬ 
zations. These data are inherently difficult to obtain. In fact, it is 
precisely this asymmetry in data availability that forms the basis for 
the implicit regulation of executive compensation in publicly held 
corporations. 

IV. Summary

Our analysis of performance pay and top-management incentives for 
over 2,000 CEOs in three samples spanning five decades indicates 
that the relation between CEO wealth and shareholder wealth is small 
and has fallen by an order of magnitude in the last 50 years. Table 11, 
based primarily on the Forbes sample of 1,295 firms, provides an 
overview of our final results for the full sample and for firms with 
market value in a given year above or below the sample median mar¬ 
ket value for that year. In sum, our evidence yields the following 
conclusions. 

1. On average, each $1,000 change in shareholder wealth corre¬ 
sponds to an increase in this year's and next year’s salary and bonus of 
about two cents. The CEO’s wealth due to his cash compensation— 
defined as his total compensation plus the discounted present value of 
the change in his salary and bonus—changes by about 30¢ per $1,000
change in shareholder wealth. In addition, the value of the CEO’s 
stock options—defined as the value of the outstanding stock options 
plus the gains from exercising options—changes by 15¢ per $1,000.
Our final upper-bound estimate of the average compensation-related 
wealth consequences of a $1,000 change in shareholder value is 45¢
for the full sample, 40¢ for large firms, and 90¢ for small firms.

2. Our weighted-average estimate of the CEO’s dismissal-related 
wealth consequences of each $1,000 shareholder loss for an average-size
firm with -50 percent net-of-market returns for two consecutive
years is 30¢ for the full sample, 5¢ for large firms, and $2.25 for small
firms. Therefore, the total pay-performance sensitivity—including
both pay and dismissal—is about 75¢ per $1,000 change in shareholder
wealth for the full sample (45¢ and $3.15 per $1,000 for large
and small firms, respectively). 

3. The largest CEO performance incentives come from ownership 
of their firms’ stock, but such holdings are small and declining. Median
1986 inside stockholdings for 746 CEOs in the Forbes compensation
survey are 0.25 percent, and 80 percent of these CEOs hold less
than 1.4 percent of their firms’ shares. Median ownership for CEOs 
of large firms is 0.14 percent and for small firms is 0.49 percent. 
Adding the incentives generated by median CEO stockholdings to 
our previous estimates gives a total change in all CEO pay- and stock- 
related wealth of $3.25 per $1,000 change in shareholder wealth for 
the full sample, $1.85 per $1,000 for large firms, and $8.05 for small 
firms. 

4. Boards of directors do not vary the pay-performance sensitivity 
for CEOs with widely different inside stockholdings. 

5. Although bonuses represent 50 percent of CEO salary, such
bonuses are awarded in ways that are not highly sensitive to perfor¬ 
mance as measured by changes in market value of equity, accounting 
earnings, or sales. 

6. The low variability of changes in CEO compensation reflects the 
fact that in spite of the apparent importance of bonuses in CEO 
compensation, they are not very variable from year to year. The fre¬ 
quency distributions of annual percentage changes in CEO salary plus 
bonus and total pay are comparable to that of a sample of 10,000 
randomly selected workers. Thus our results indicating a weak rela¬ 
tion between pay and performance are not due to boards of directors 
using measures of managerial performance that are unobservable 
to us. 

7. Median CEO inside stockholdings for the 120 largest NYSE firms 
fell by an order of magnitude from 0.3 percent in 1938 to 0.03 per¬ 
cent in 1984. 

8. The average standard deviation of pay changes for CEOs in the 
top quartile (by value) of all NYSE firms fell from $205,000 in 1934- 
38 to $127,000 in 1974-86. 

9. The pay-performance sensitivity for top-quartile CEOs fell by an 
order of magnitude from 17.5¢ per $1,000 in 1934-38 to 1.9¢ per
$1,000 in 1974-86. 

10. The average salary plus bonus for top-quartile CEOs (in 1986 
dollars) fell from $813,000 in 1934-38 to $645,000 in 1974-86, while 
the average market value of the sample firms doubled. 
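
The stock-ownership figures in conclusions 3 and 7 translate directly into dollars per $1,000 of shareholder wealth: a CEO who owns a given fraction of the firm gains or loses that fraction of any change in its value. A minimal sketch of the conversion follows; the function name `stock_incentive_per_thousand` is hypothetical, and the ownership percentages are the medians quoted in the list above.

```python
def stock_incentive_per_thousand(ownership_percent):
    """Dollar change in the CEO's stock wealth per $1,000 change in firm value.

    ownership_percent is the CEO's holding as a percentage of shares
    outstanding (0.25 means one-quarter of 1 percent).
    """
    return ownership_percent / 100 * 1_000


median_ownership = {
    "full sample, 1986": 0.25,
    "large firms, 1986": 0.14,
    "small firms, 1986": 0.49,
    "120 largest firms, 1938": 0.30,
    "120 largest firms, 1984": 0.03,
}

incentives = {
    label: round(stock_incentive_per_thousand(pct), 2)
    for label, pct in median_ownership.items()
}

for label, dollars in incentives.items():
    print(f"{label}: ${dollars:.2f} per $1,000")
```

The three 1986 values correspond to the stock-ownership row of table 11, and the 1938 and 1984 values show the tenfold decline described in conclusion 7.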

The lack of strong pay-for-performance incentives for CEOs indi¬ 
cated by our evidence is puzzling. We hypothesize that political forces 
operating both in the public sector and inside organizations limit 
large payoffs for exceptional performance. Truncating the upper tail 
of the payoff distribution requires that the lower tail of the distribu¬ 
tion also be truncated in order to maintain levels of compensation 
consistent with equilibrium in the managerial labor market. The re¬ 
sulting general absence of management incentives in public corpora¬ 
tions presents a challenge for social scientists and compensation prac¬ 
titioners. 

We formalize a view of entrepreneurship in the spirit of Theodore 
W. Schultz. In this view, entrepreneurs are those individuals who 
respond to the opportunities for creating new products (and the 
like) that arise because of technological progress, for example. The 
theory has implications for entry and exit, specialization of labor, 
and business transfers. These business transfers correspond to, 
among other things, individuals changing jobs and sales of firms. 
Transfers are seen as a mechanism facilitating division of labor. We 
also discuss evidence on business transfers that occur through sales 
of firms. 


I. Introduction 

Substantial progress has been made in the analysis and measurement 
of entrepreneurship in agriculture. This important literature includes 
the pioneering research of Griliches (1957), Welch (1970), and 
Schultz (1975, 1980). A very similar view of the entrepreneur was 
employed in each of these studies and throughout this literature. 
More recently, Baumol (1986, 1988) has successfully employed a similar 
view in order to measure entrepreneurship at a more aggregate 
level. In this conception of entrepreneurship, risk bearing is not the 
defining characteristic of entrepreneurs (as it is in the theory of Kihl- 
strom and Laffont [1979]). Rather entrepreneurs are those individ¬ 
uals who respond to the opportunities for creating new products (and 
the like) that arise because of technological breakthroughs, for exam¬ 
ple. 

Given the impressive record of this view in guiding research, it is 
surprising that it has been all but ignored in the theoretical literature. 
This paper has two purposes. The first is to begin integrating this 
Schultzian entrepreneur into the mainstream theoretical literature. 
We accomplish this by (1) developing a formal model that captures 
the spirit of this entrepreneur and (2) characterizing the equilibrium 
of this model. The second purpose is to start exploring the usefulness 
of the theory in generating measurable implications. We present nu¬ 
merous implications below. 

II. Brief Description of the Theory and 
Its Implications 

The model has two key features. The first crucial assumption is that 
opportunities for developing new products repeatedly arise through 
time. This continual emergence of opportunities is central to Schultz’s 
discussion of entrepreneurship. 1 For Schultz, their source is the “dis- 
equilibria that are inevitable in the dynamics of modernization and 
economic growth” (1980, p. 439). There are many sources of these 
“disequilibria” (and hence opportunities), and they include those 
arising from technical progress and demographic shifts in the 
population. Throughout this section, for the purposes of discussion, 
we shall assume that the source of new opportunities is technical 
progress. 

The second key feature is that we assume that individuals differ in 
their abilities to develop emerging opportunities. Our motivation for 
this assumption also comes from the entrepreneurship literature dis¬ 
cussed above. Numerous studies have shown that entrepreneurial 
ability can be enhanced through experience, training, schooling, and 


1 Pursuit of opportunities is also stressed by Rosen (1983a). For him, 
entrepreneurship is “exploiting the new opportunities that inventions 
provide, more in the form of marketing and developing them for 
widespread use in the economy than developing the knowledge itself” 
(p. 307). 
improvements in health (these studies are reviewed in Schultz [1975, 
1980, 1989]). Given that there is considerable evidence on this point, 
it would be desirable to include investment in entrepreneurial ability 
in our theory. Suppose that all individuals were initially identically 
endowed in entrepreneurial ability and that investments were possible. 
Then, as Rosen (1983b, p. 43) has argued, individuals would 
“have incentives to specialize their investments in skills and trade with 
each other for this reason.” As a result of specialization in invest¬ 
ments, the distribution of entrepreneurial abilities would become 
skewed (i.e., some individuals would invest in this skill and some 
would not). While we do not formally model this investment process, 
this discussion motivates our assumption that individuals differ in 
their entrepreneurial abilities. 2 

There are two tasks in the economy, developing products and pro¬ 
ducing products previously developed. For the purposes of discus¬ 
sion, we shall often describe these tasks as being undertaken by indi¬ 
viduals who are self-employed. However, we could have easily chosen 
to discuss the tasks as if they were performed by employees. There is 
no difference from the point of view of the theory developed here. 

When an individual spends time pursuing an opportunity, a new 
product is created. This task will be called the entrepreneurial task. 
We shall often refer to this process by saying that the person “starts” a 
new “business.” We shall assume that the developer of a business can 
fully capture the benefits associated with the new product. The re¬ 
turns to product development will accrue primarily to the entrepre¬ 
neur when imitation is costly. This will be the case, for example, if an 
entrepreneur discovers that a certain location is well suited for a 
particular product. Once the entrepreneur has established a business 
at a particular location, it may be difficult to imitate that investment. 
Entrepreneurial activities in retailing are often of this form. Another 
example is when the entrepreneurial activity is the development of 
organization capital (see Prescott and Visscher 1980). The extreme 
opposite of our assumption of fully internalized returns is studied in 
Schmitz (1989), where all knowledge is a common property resource. 

When an individual spends time in the production of products 
previously introduced, we say that he undertakes the management 
task. For simplicity, it is assumed that all individuals are equally tal¬ 
ented in this activity. 

2 While Schultz (1989) has argued that (1) the ability to develop new opportunities 
can be enhanced through investment and (2) specialization in investments is an impor¬ 
tant phenomenon, as far as we know he has not discussed the implication that some 
individuals will specialize in investing in entrepreneurial ability. Perhaps this is because 
of his desire to keep attention on the fact that many people engage in entrepreneurial 
activity. 
We shall now discuss some of the theory’s implications. We shall 
describe only the “very broad” predictions in this section. Numerous 
other detailed implications along with the available empirical evi¬ 
dence will be discussed after the model’s formal development. 

Because of continual technological progress there will be constant 
pressure to develop businesses embodying new technology. Conse¬ 
quently, there will be a need to free up resources through the discon¬ 
tinuance of previously developed businesses. Therefore, the theory 
generates both entry and exit of businesses (i.e., the opening and 
shutdown of plants, the start and discontinuance of product lines, or 
the entry and exit of firms) and contributes to the literature on entry 
and exit, which includes Jovanovic (1982) and Ericson and Pakes 
(1987) (see also Hause and Du Rietz 1984; Hopenhayn 1986; Evans 
1987; Jovanovic and MacDonald 1988; Lambson 1988). 

Because opportunities for developing new products continually 
emerge and since there are differences in ability to develop busi¬ 
nesses, some individuals will specialize in entrepreneurial activities. 
The theory is therefore a model of occupational choice as well (see, 
e.g., Roy 1951). 3 While specialization is certainly a pervasive activity in 
modern economies, is specialization in entrepreneurship an important 
activity in these economies? We shall use our model to examine this 
issue by comparing the implications of our theory with those of an¬ 
other model with only a single-ability type in which no specialization 
occurs. As an example of different implications, an individual in the 
single-ability model would start a new business only if the person’s 
previously developed product has been discontinued. In our special¬ 
ization model, some individuals will engage in developing products 
even as businesses they previously developed continue to be managed. 
We shall present evidence that suggests that specialization in 
entrepreneurship is indeed important. 

Given that individuals specialize in jobs, the personnel of some 
businesses will change over time. That is, a particular business may be 
developed by one person and managed by another later on. We shall 
refer to this change in personnel as a “business transfer.” In the actual 
economy, business transfers will occur in numerous ways, depending 
on the institutional context in which they take place. For example, 
consider the following sequence of events. An individual 
“successfully” develops a business. Rather than manage the business, 
he pursues another opportunity. Consider a few of the ways this can 
take place. 
The entrepreneur could have been an employee when he developed 
the business and pursued the next opportunity as an employee of the 

3 Other models of occupational choice in the industrial organization 
literature include Lucas (1978) and Rosen (1982). 
same firm. The business he developed might “move” from one division 
to another (e.g., from an R & D division to a production division) 
or he himself might move from one division to another. On the other 
hand, if his employment ended with the initial firm and he pursued 
the new opportunity at another firm, then the transfer would corre¬ 
spond to job turnover (as when Steve Jobs left Apple Computer to 
start Next). 4 Still another possibility is that the entrepreneur was self- 
employed and that he sold the business in order that he could start 
another. 5 The transfer in this case will correspond to a sale of a firm. 
While all these “movements” may appear quite different, they all 
serve the same purpose: to facilitate division of labor. Our theory 
applies equally to, and can be used to study, each of these situations. 

We give two examples here of specific predictions regarding busi¬ 
ness transfers. First, in a cohort of new businesses developed at a 
certain date, those that are subsequently involved in a transfer will on 
average be of higher “quality” (defined below) and also survive longer 
than those that are not transferred. Second, at a more aggregate level, 
economies that experience greater population growth will also have a 
higher rate of transfer activity. 

We shall discuss below evidence regarding these and other predic¬ 
tions on business transfers. Evidence on business transfers accom¬ 
plished through firm sales is more readily available than that for other 
business transfers. Consequently, we focus on firm sales in our discus¬ 
sion of evidence. 


III. Description of the Model 

A. The Physical Environment 

The model concerns a discrete-time, infinite-horizon economy. At 
each date there are three types of goods: a consumption good, capital 
goods (which we refer to as “businesses”), and the time of individuals. 
Time can be used to produce the consumption good by “managing” 
businesses. Time may also be used to develop one of the new 
opportunities that arrive over time; that is, it can be used to “start” a business. 
No output is produced during the development phase (which is one 


4 In Rosen (1972), individuals go through a period of development. Firms differ in 
whether they provide development training to individuals or not. Here the situation is 
reversed, but in both cases a change in business operation is required. See MacDonald 
(1988) for more on job turnover. 

5 The idea that business sales facilitate specialization was anticipated to some extent 
by Penrose (1959). She recognized that “there are likely to be firms who want to 
withdraw from given lines of activity ... [since] new opportunities may have arisen as a 
result of developments within the firm or in the outside world” (p. 179). Arrow (1985) 
discusses related issues. 
period in the model). Once development of a business is complete, 
output is forthcoming provided that someone uses his time to manage 
the business. If a person starts or manages a business, we say that the 
person “operates” the business. 

Individuals in the economy are assumed to maximize the expected 
sum of discounted utility (uncertainty enters the model in the start-up 
process). The discount factor is $\beta$, while the period utility function is 
taken to be linear in the consumption good. Assuming linear utility 
allows us to focus on the market for businesses since there will be no 
trade of consumption goods in equilibrium. 6 As long as an individual 
is in the labor force (the issue of retirement is discussed shortly), he or 
she is endowed with an indivisible time unit each period. Individuals 
are assumed to have the same endowment of managerial ability, but 
they differ in their “start-up” or entrepreneurial ability, which is 
indexed by the parameter $\theta$. 

The technology for production of the consumption good is a fixed- 
proportions technology. The two inputs, a single unit of time and a 
business, are combined to produce the consumption good. A business 
is rated by its quality or productivity q. If an individual allocates his 
unit time endowment to manage a business of quality q, then q units of 
the consumption good are produced. 

A business is developed when an individual allocates his unit time 
endowment to start-up. A unit time investment by a type $\theta$ today 
yields a business next period of quality $q_r$, where $q_r$ is a random 
variable with distribution $F(q, \theta) = \Pr(q_r \le q \mid \theta)$ for 
$q \in [0, \bar q]$. We assume that $F$ is continuously differentiable. Let 
$f(q, \theta)$ denote the density. We assume that $f$ is strictly positive on 
$[0, \bar q]$. Higher-$\theta$ types have more entrepreneurial ability in the 
sense that $F(q, \theta_1) > F(q, \theta_2)$ if $\theta_2 > \theta_1$ and 
$q \in (0, \bar q)$. That is, we assume strict first-order stochastic 
dominance. 
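
The ordering of types by first-order stochastic dominance can be made concrete with a small numerical sketch. The mixture family below is purely an illustrative assumption (the paper does not specify a functional form); `Q_BAR`, `cdf`, `density`, `draw_quality`, and `dominates` are hypothetical names. The family is chosen only because it is continuously differentiable, has a strictly positive density on the whole support, and is strictly ordered in the ability parameter.

```python
import math

Q_BAR = 1.0  # hypothetical upper bound on business quality


def cdf(q, theta, q_bar=Q_BAR):
    """Illustrative F(q, theta): a mixture of a uniform and a triangular CDF.

    With x = q / q_bar, F = (1 - theta/2) * x + (theta/2) * x**2, theta in [0, 1].
    This functional form is an assumption, not taken from the paper.
    """
    x = min(max(q / q_bar, 0.0), 1.0)
    return (1.0 - theta / 2.0) * x + (theta / 2.0) * x * x


def density(q, theta, q_bar=Q_BAR):
    """Derivative of cdf in q; at least 1 / (2 * q_bar) everywhere on [0, q_bar]."""
    x = q / q_bar
    return ((1.0 - theta / 2.0) + theta * x) / q_bar


def draw_quality(u, theta, q_bar=Q_BAR):
    """Inverse-CDF draw: map a uniform number u in [0, 1] to a start-up quality."""
    if theta == 0.0:
        return u * q_bar
    a = 1.0 - theta / 2.0
    # Positive root of (theta/2) * x**2 + a * x - u = 0.
    x = (-a + math.sqrt(a * a + 2.0 * theta * u)) / theta
    return x * q_bar


def dominates(theta_hi, theta_lo, grid=99):
    """True if F(q, theta_lo) > F(q, theta_hi) at every interior grid point."""
    points = [Q_BAR * (i + 1) / (grid + 1) for i in range(grid)]
    return all(cdf(q, theta_lo) > cdf(q, theta_hi) for q in points)
```

For example, `dominates(0.8, 0.2)` checks that a type with ability 0.8 is less likely than a type with ability 0.2 to fall below any interior quality level, which is the sense in which higher types are better entrepreneurs.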

6 Suppose that (1) utility were nonlinear, creating a preference for smooth 
consumption; (2) businesses were required to be owner operated; and (3) capital 
markets were imperfect. Then a problem of liquidity constraints might arise (see 
Evans and Jovanovic [1989] for an analysis of entrepreneurship in the presence of 
liquidity constraints). 

Entrepreneurial ability will be valued if existing businesses become 
obsolete through time. One way to model obsolescence is to have 
business quality depreciate over time. Suppose that the realized 
outcome at time $t + 1$ of an entrepreneur's investment at time $t$ is a 
business of productivity $q$. (Remember that no output is forthcoming 
during period $t$.) Then assume at time $t + 1 + s$, $s \in \{0, 1, \ldots\}$, 
that the productivity of the business is $\gamma^s q$, $\gamma \in [0, 1]$. A 
smaller $\gamma$ means faster depreciation, $\gamma = 0$ meaning that a 
business can be managed for at most a single period. In an alternative 
specification of the model, business quality, once realized, is constant 
over time. However, because of exogenous technical progress, the 
average quality of new businesses increases over time. In this 
specification, the distribution of quality outcomes for new businesses 
shifts by a multiplicative factor $\rho \ge 1$ each period. This “technical 
progress” setup with $\rho = \gamma^{-1}$ and discount factor $\gamma\beta$ is 
formally equivalent to the depreciation specification with discount 
factor $\beta$. While the technical progress specification is perhaps our 
preferred interpretation, we discuss the depreciation specification for 
its notational simplicity. 
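
A minimal numerical sketch of why the two specifications line up, under stated assumptions: it looks only at the present value of managing a single business over a finite horizon, not at the full equilibrium. The function names and the parameter values in the usage note are hypothetical. In the depreciation reading, output falls geometrically and is discounted at the ordinary discount factor; in the technical-progress reading, output is constant but the discount factor is scaled by the depreciation parameter.

```python
def depreciated_quality(q, gamma, s):
    """Productivity s periods after realization in the depreciation reading."""
    return (gamma ** s) * q


def relative_quality(q, rho, s):
    """Technical-progress reading: quality stays at q while the quality of new
    businesses grows by rho each period, so quality relative to new entrants
    is q / rho**s."""
    return q / (rho ** s)


def present_value_depreciation(q, beta, gamma, horizon):
    """Discounted output of one business whose quality decays by gamma per period."""
    return sum((beta ** s) * depreciated_quality(q, gamma, s) for s in range(horizon))


def present_value_progress(q, discount, horizon):
    """Discounted output of one business of constant quality q."""
    return sum((discount ** s) * q for s in range(horizon))
```

With, say, a discount factor of 0.95 and a depreciation parameter of 0.8, `present_value_depreciation(2.0, 0.95, 0.8, 50)` and `present_value_progress(2.0, 0.8 * 0.95, 50)` agree up to rounding, and `relative_quality(2.0, 1 / 0.8, s)` matches `depreciated_quality(2.0, 0.8, s)` for each `s`.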

The identity of the person operating a given business may change 
over time. If such a change is made after the development period, the 
new operator must be trained to handle the idiosyncrasies of this 
particular business. We assume that this training involves a resource 
cost of $\alpha \ge 0$ units of the consumption good. 7 In general, a training 
cost will have both a fixed component and a variable cost. For simplicity, 
we examine only the fixed component. Throughout we shall refer 
to $\alpha$ as the transfer cost. 

Finally, we specify the demographics of the economy. In order that 
we have no aggregate uncertainty, we assume that there is a 
continuum of individuals of each type $\theta$. Without loss of generality we 
assume that $\theta$ is uniformly distributed on the interval [0, 1] (so 
$\theta$ can be interpreted as the percentile ranking). We also assume that 
there is birth “into” and retirement “out of” the population of “active” 
workers. Specifically, let $N_t$ be the number of individuals who operate 
firms during period $t$. At the beginning of period $t + 1$, $\delta N_t$ 
individuals retire while $\lambda N_t$ new individuals are born so that the 
size of the active labor force during period $t + 1$ is 
$N_{t+1} = (1 - \delta + \lambda)N_t$. For simplicity we assume that birth and 
retirement rates are identical across $\theta$ types and that the 
probability of retirement is independent of the individual’s age. 8 
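
The law of motion for the active labor force can be traced out with a few lines of code. This is only a sketch: `labor_force_path` and `survival_probability` are hypothetical helper names, and the retirement and birth rates used in the usage note are made-up numbers.

```python
def labor_force_path(n0, delta, lam, periods):
    """Active labor force over time when a fraction delta retires and a
    fraction lam is born each period, independent of type and age."""
    path = [n0]
    for _ in range(periods):
        current = path[-1]
        retirees = delta * current
        entrants = lam * current
        path.append(current - retirees + entrants)
    return path


def survival_probability(delta, s):
    """Probability that an individual active today is still active s periods
    from now under a constant retirement hazard delta."""
    return (1.0 - delta) ** s
```

With an initial labor force of 100, a retirement rate of 0.10, and a birth rate of 0.15, `labor_force_path(100.0, 0.10, 0.15, 2)` grows by 5 percent per period.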


B. Definition of Equilibrium 


We shall employ a competitive equilibrium concept. Because our 
model has a continuum of agents and no aggregate uncertainty, it 



7 In the technical progress specification of the model, in order for us to have the 
analogue of stationary equilibrium, the training cost must grow at rate $\rho$. This can be 
accomplished if the training involves a set number of labor units rather than consump¬ 
tion units, so that the training cost and wages grow at the same rate. This case of 
endogenous training cost is a straightforward extension of our analysis. 

8 A constant hazard rate is, of course, a very crude way to model departure from the 
labor force. Suppose that alternative opportunities, say in home production, arrive 
stochastically. Then we are assuming here that an individual takes advantage of an 
alternative opportunity whenever one arrives, regardless of, say, his ability parameter 
$\theta$. A richer model of departure from the labor force would include the individual’s state 
and income effects as determinants. 
belongs to the class of anonymous sequential games studied by 
Jovanovic and Rosenthal (1988). Consequently, their framework 
could be used to prove existence in our model. However, rather than 
map our framework into theirs, we shall construct an existence argu¬ 
ment ourselves since this construction provides a characterization of 
the equilibrium. 

In the competitive equilibrium there will be prices for both labor 
and businesses. However, since the structure of ownership is irrele¬ 
vant, when deriving the competitive equilibrium we can assume with¬ 
out loss of generality that each business is owner operated. We 
thereby economize on notation by not explicitly considering the labor 
market. It is straightforward to derive equilibrium wage rates from 
our analysis. 

Our formal definition of equilibrium is in the spirit of Prescott and 
Mehra (1980). Let $S_t$ denote the economywide state variable at date 
$t$, which is a “list” giving the quality of businesses currently owned by 
each $\theta$ type. The first object in the definition of equilibrium is a 
price function $p(q \mid S_t)$ that assigns a price to each quality business 
(in units of the current consumption good) given the current state 
$S_t$. The second object is a transition function $\mathcal{S}$ that 
determines next period’s state given the current state, 
$S_{t+1} = \mathcal{S}(S_t)$. The third object is a policy function $\pi$ 
determining the optimal action for an individual given that person’s 
state. An individual’s state is given by $(q, \theta, l, S)$, where $q$ is the 
quality of the business held by the individual at the beginning of the 
period (we call this the “current business”), and $l$ is the person’s 
work status ($l = 0$ means retiring; $l = 1$ means active this period). We 
shall describe the set of possible actions below, but briefly they are 
whether to manage one’s current business or to dispose of it through 
sale or discontinuance; and if disposing, whether to start or purchase 
a new business. An equilibrium is a price and transition function 
together with individual policy functions such that (1) if the price and 
transition functions are taken as given, the policy functions maximize 
expected utility; (2) supply of businesses equals demand for 
businesses; and (3) the transition function that individuals take 
parametrically is also that transition law generated by aggregate behavior. 

We shall focus in this paper on stationary equilibrium. In our un¬ 
published working paper we prove a limited convergence result, 
showing that in the special case of $\gamma = 0$ and $\alpha = 0$ there is 
convergence to the stationary equilibrium. 


IV. Stationary Equilibrium 

A stationary equilibrium is an equilibrium for which $S_{t+1} = S_t = S$. It 
is constructed in three subsections: subsection A determines the 
structure of prices, subsection B determines individual decisions given 
prices, and subsection C calculates the aggregate supply of and de¬ 
mand for businesses resulting from these individual decisions. The 
construction proves that there exists a unique stationary equilibrium. 

A. The Price of Businesses 

In stationary equilibrium there will be a “marginal business” $\hat q$ such 
that at each date businesses with quality above $\hat q$ will have positive 
price while businesses of quality below $\hat q$ will be in excess supply and 
have zero price. In equilibrium, an individual who buys a business 
must be indifferent about which quality he buys and, in particular, 
must be indifferent to purchasing $q > \hat q$ and the “free” business 
$\hat q$. For a business $q > \hat q$, let $p(q, \hat q) > 0$ be the price that makes 
the individual indifferent between $q$ and $\hat q$. (Note we can drop the 
dependence of $p$ on $S$ in stationary equilibrium.) For $q \le \hat q$, 
$p(q, \hat q) = 0$. The price $p(q, \hat q)$ does not include the transfer cost 
(this is without loss of generality). We shall formally calculate the 
equalizing differential condition after introducing some additional 
notation. 

B. Individual Behavior 

Fix the marginal business $\hat q$ and therefore the price $p(q, \hat q)$ of 
business $q$ that satisfies the equalizing differential condition. This 
subsection analyzes individual behavior taking this price system as 
given. An individual’s decision at a given date depends on that 
person’s triple $(q, \theta, l)$ at that date. A person retiring ($l = 0$) is 
forced to dispose of his business. We assume that a retiree can still 
consume, so he sells his business if $p(q, \hat q) > 0$; otherwise he 
discontinues the business. A person who is active ($l = 1$) either manages 
his current business or disposes of it (again, disposal means sale if 
$p(q, \hat q) > 0$ and discontinuance if $p(q, \hat q) = 0$); if he disposes of 
his current business, he chooses between starting a new business and 
purchasing a previously developed business to manage. We have the 
following theorem. 

Theorem 1. The optimal action of an active person depends on his 
type $\theta$ and the quality $q$ of his current business as illustrated in 
figure 1. 

As depicted in the figure, the set of $\theta$ types is partitioned into three 
regions. The M region is the set of low-$\theta$ types. They have a 
comparative advantage in management and completely specialize in this 
activity. The E region is the set of high-$\theta$ types who, with a 
comparative advantage in entrepreneurship, completely specialize in this 
activity. The J region is a set of intermediate-$\theta$ types. They are 
“jacks-of-all-trades” who manage businesses they themselves started (though they 
Fig. 1. $m_R$ means retain and manage current business, $m_D$ means discontinue 
current business and purchase another to manage, $s_T$ means transfer current business 
and start another business, and $s_D$ means discontinue current business and start another. 


may sell if their current business is of high enough quality). In the 
remainder of this section we shall explain this result in more detail. 
The theorem is formally stated and proved in our working paper, 
which is available on request (most of the theorems to follow are also 
proved in the working paper). 


1. The Value Function of an Active Person 

Let $v(q, \theta, \hat q)$ denote the expected lifetime return to an active 
individual of type $\theta$ who begins the period with a business of quality 
$q$ and makes optimal choices throughout his life. Suppose that the 
individual decides to dispose of his current business $q$ and faces the 
decision of whether to start a new business or to purchase a previously 
developed business. Given that the current business $q$ is being 
disposed of, the choice of whether to start or purchase is independent 
of $q$ and depends only on $\theta$. It should be clear that 
“low”-entrepreneurial-ability types will choose to purchase, while 
“high”-ability types will choose to start. More precisely, there will be a 
$\theta_1$ such that for $\theta < \theta_1$ purchase is always better than 
start-up and for $\theta > \theta_1$ start-up is always better than purchase. 
Let $v^m(q, \hat q)$ ($m$ for management) be the expected return to an 
individual when the option of business start-up is permanently 
removed from his choice set (note that $v^m$ is independent of $\theta$). 
Let $v^s(q, \theta, \hat q)$ ($s$ for start) be the return to an individual of 
type $\theta$ when the option of buying a business is removed from his 
choice set. Since $\theta < \theta_1$ would never start a business anyway, 
$v(q, \theta, \hat q) = v^m(q, \hat q)$ for such $\theta$. Similarly, 
$v(q, \theta, \hat q) = v^s(q, \theta, \hat q)$ for $\theta > \theta_1$. 


2. Analysis of $v^m$ 

In order to calculate $v^m$, note that because of the transfer cost it will 
never be optimal in stationary equilibrium to sell a business and then 
purchase another. Hence an individual of type $\theta \le \theta_1$ has two 
relevant choices: either manage the current business or discontinue 
the current business and purchase another. The value function 
$v^m(q, \hat q)$ is given by 

$$v^m(q, \hat q) = \max\{\, q + \beta(1 - \delta)\,v^m(\gamma q, \hat q) + \beta\delta\,p(\gamma q, \hat q),\;\; \hat q - \alpha + \beta(1 - \delta)\,v^m(\gamma\hat q, \hat q) \,\}. \tag{1}$$

The first payoff is the return to managing the current business. This 
action yields $q$ units of the consumption good today and, because of 
depreciation, a business of quality $\gamma q$ tomorrow. This sells for 
$p(\gamma q, \hat q)$ tomorrow if he retires. The second payoff is the return 
to discontinuing the current business and purchasing $\hat q$. Remember 
that because of the equalizing price differential, he is indifferent about 
which quality above $\hat q$ he selects; therefore we can assume that he 
buys $\hat q$. Current consumption when $\hat q$ is purchased is 
$\hat q - \alpha$, which reflects the fact that the buyer pays the transfer 
cost. In the following period the individual holds a $\gamma\hat q$ business, 
which he will be willing to manage if $\hat q - \alpha < \gamma\hat q$. 
Straightforward analysis of equation (1) shows that, as illustrated in 
figure 1, the optimal policy for types $\theta < \theta_1$ is a cutoff rule 
defined by a number $q^m \in [\hat q - \alpha, \hat q]$ such that if $q > q^m$ 
the current business is managed, while if $q < q^m$ the business is 
discontinued and another business is purchased. 

Having defined $v^m$, we can now formalize the equalizing differential 
condition. The price an individual in the M region would be willing to 
pay for a business of quality $q > \hat q$ rather than pay nothing for 
$\hat q$ is simply $p(q, \hat q) = v^m(q, \hat q) - v^m(\hat q, \hat q)$. In the special 
case of $\gamma = 0$ (i.e., complete depreciation after one period of use), 
this reduces to $p(q, \hat q) = q - \hat q$. In the special case of $\alpha = 0$, 
this reduces to $p(q, \hat q) = q - \hat q + \beta p(\gamma q, \hat q)$. 
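
The zero-transfer-cost special case gives a recursion for the price that is easy to unroll numerically. The sketch below is an illustration under stated assumptions: it covers only the case of no transfer cost, it takes the quality of the marginal business as a given positive number rather than solving for it in equilibrium, and it requires depreciation strictly below one so that the loop ends. `price_no_transfer_cost` is a hypothetical name.

```python
def price_no_transfer_cost(q, q_hat, beta, gamma):
    """Price of a business of quality q when the transfer cost is zero.

    Unrolls p(q) = q - q_hat + beta * p(gamma * q), with p = 0 at or below
    q_hat: the buyer pays for the discounted stream of quality premia over
    the free marginal business for as long as the premium stays positive.
    """
    if q_hat <= 0.0:
        raise ValueError("this sketch assumes a strictly positive marginal quality")
    if not 0.0 <= gamma < 1.0:
        raise ValueError("this sketch assumes 0 <= gamma < 1 so the loop terminates")
    price = 0.0
    discount = 1.0
    while q > q_hat:
        price += discount * (q - q_hat)  # premium earned this period
        discount *= beta                 # move one period ahead
        q *= gamma                       # quality depreciates
    return price
```

With complete depreciation the loop runs once and the price collapses to the one-period quality premium, matching the other special case in the text. For instance, `price_no_transfer_cost(2.0, 1.0, 0.9, 0.0)` returns the premium of 1.0.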
3. Analysis of $v^s$ 

In the calculation of $v^s$, note that types $\theta > \theta_1$ have only two 
choices at each date: either retain and manage their current business 
or dispose of the business and start another. Therefore, 
$v^s(q, \theta, \hat q)$ satisfies 

$$v^s(q, \theta, \hat q) = \max\{\, q + \beta(1 - \delta)\,v^s(\gamma q, \theta, \hat q) + \beta\delta\,p(\gamma q, \hat q),\;\; p(q, \hat q) + \beta\,r(\theta, \hat q) \,\}, \tag{2}$$

where $r(\theta, \hat q)$ is the expected receipts beginning next period 
resulting from start-up today: 

$$r(\theta, \hat q) \equiv \int_0^{\bar q} f(x, \theta)\,[(1 - \delta)\,v^s(x, \theta, \hat q) + \delta\,p(x, \hat q)]\,dx.$$

Let $\Delta(q, \theta, \hat q)$ be the return from choosing start-up minus the 
return from retaining and managing (i.e., the second payoff in the max 
operator in eq. [2] minus the first payoff). For those types with the 
highest entrepreneurial ability, $\Delta$ is always positive in stationary 
equilibrium. Formally, there will be a $\theta_2 \ge \theta_1$ such that 
$\Delta(q, \theta, \hat q) > 0$ for all $q$ if and only if 
$\theta \in E = (\theta_2, 1]$. As illustrated in figure 1, types $\theta$ in the 
E region always start firms, selling them if $q > \hat q$. 

If the transfer cost is strictly positive, then specialization in the 
economy will be incomplete in the sense that some types will manage 
and start businesses. Formally, $\theta_2 > \theta_1$, with nonspecialist types 
lying in $J = (\theta_1, \theta_2)$. Consider the optimal policy for a type 
$\theta \in J$. For small $q$, it is not worthwhile managing the current 
business. The individual discontinues the current business and starts a 
new one. If the quality of the current business is increased, there will 
be a point $q_L(\theta)$ at which it does pay to manage the current 
business. If the quality of the existing business is even higher, there 
may be a quality $q_H(\theta)$ at which it pays to sell the business and 
start another. 9 What is the intuition? Firms of quality 
$q \in (q_L, q_H)$ are of high enough quality to be managed but are not of 
such quality that they should be transferred. The costs involved in 
such a transfer outweigh the benefit of “freeing up” the individual's 
time. However, for businesses of quality $q > q_H$, the calculation is 
reversed. For such quality, transfer is efficient since these 
high-quality businesses are expected to last a long time and the fixed 
cost of the transfer can be spread out over a number of periods (recall 
that transfer cost is independent of quality). These efficiency 
considerations will be reflected in the price system. The only remaining 
issue is the slope of the $q_L$ and $q_H$ curves in figure 1. The greater the 
entrepreneurial ability $\theta$, the greater the benefit to pursuing 
entrepreneurial rather than managerial activities. Hence, the start-up 
regions lie to the right of the $q_H$ and the $q_L$ curves or, equivalently, 
the $q_H$ and $q_L$ curves are, respectively, downward and upward 
sloping. 10 

9 If $\gamma\bar q > \hat q$, then $q_H(\theta) < \bar q$ for a range of $\theta$ in the J region. 
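
The partition described by Theorem 1 and figure 1 can be written as a small decision rule. This is a sketch under loud assumptions: the cutoffs are taken as given inputs rather than derived from the value functions, the linear cutoff curves in the usage note are invented for illustration, and the treatment of exact ties at the boundaries is arbitrary. All names (`optimal_action`, `q_low`, `q_high`, and so on) are hypothetical.

```python
def optimal_action(q, theta, q_hat, theta_1, theta_2, q_m, q_low, q_high):
    """Classify an active person's action using the labels of figure 1.

    q        quality of the current business
    theta    entrepreneurial ability
    q_hat    quality of the marginal (zero-price) business
    theta_1  upper edge of the M region; theta_2 lower edge of the E region
    q_m      management cutoff used by M types
    q_low, q_high  functions of theta giving the two cutoffs in the J region

    Returns "m_R" (retain and manage), "m_D" (discontinue and purchase),
    "s_T" (transfer and start), or "s_D" (discontinue and start).
    """
    if theta <= theta_1:
        # M region: specialize in management.
        return "m_R" if q > q_m else "m_D"
    if theta > theta_2:
        # E region: always start; sell the old business only if it has a price.
        return "s_T" if q > q_hat else "s_D"
    # J region: manage intermediate qualities, start otherwise.
    if q < q_low(theta):
        return "s_D"
    if q > q_high(theta):
        return "s_T"
    return "m_R"
```

As a usage illustration with made-up cutoffs, take `theta_1 = 0.4`, `theta_2 = 0.8`, `q_hat = 0.6`, `q_m = 0.5`, an upward-sloping `q_low = lambda t: 0.3 + 0.4 * t`, and a downward-sloping `q_high = lambda t: 1.5 - 0.5 * t`. A type with ability 0.6 then discontinues and starts at quality 0.4, manages at quality 0.8, and transfers and starts at quality 1.3.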

4. Comparative Statics of Individual Behavior 

How does individual behavior change as $\hat q$ varies? Increasing the 
quality of the “free” firm $\hat q$ is equivalent to shifting the price 
distribution “down”; businesses must sell for less if anybody is to buy 
them when $\hat q$ is higher. Therefore, for higher $\hat q$, individuals tend 
to shift away from business start-up to management. As functions of 
$\hat q$, $\theta_1(\hat q)$ and $\theta_2(\hat q)$ are increasing, 
$q_L(\theta, \hat q)$ is decreasing and $q_H(\theta, \hat q)$ is increasing, and 
$q^m(\hat q)$ is increasing. The monotonicities are strict when evaluated 
at interior points. 


C. Aggregate Behavior 

Consider calculating aggregate supply and demand. Note that businesses
flow from the J and E sectors to the M sector and that there is also
trade within the M sector due to retirements. Because of the equalizing
differential condition in prices, we can speak of the supply of, and
demand for, businesses of quality above $\hat{q}$. This permits
aggregation along a single dimension. As it turns out, it is easier to
consider the related concepts of “gross demand” and “gross supply.” For
each $\hat{q}$, let $D(\hat{q}) \equiv \theta_1(\hat{q})$ be gross
demand. If we normalize the current population size to unity,
$\theta_1(\hat{q})$ is the “size” of the M sector (recall that $\theta$
is uniformly distributed on the unit interval). Therefore, gross demand
is simply the total “number” of individuals who will manage a business
that has been purchased either at the current date or at some earlier
date.11 Gross supply $S(\hat{q})$ at $\hat{q}$ equals the sum of two
parts. The first part is the stock of businesses currently in the M
sector that will be managed in the M sector in the current period.
Included in this category are the businesses currently held by active
members of the M sector that satisfy the $q^M$ cutoff plus the
businesses currently held by retiring members of the M sector that
satisfy the $\hat{q}$ cutoff. The second part of gross supply is the
flow of businesses in the current period from the J and E sectors to
the M sector. This category includes (1) the businesses that surpass
the cutoff for sale by an active person (i.e., surpass the
$q_H(\theta)$ or $\hat{q}$ cutoff depending on whether $\theta$ is in
the J or E region) and (2) the businesses held by retirees that surpass
the $\hat{q}$ cutoff.

10 We are ignoring a technical issue here that is treated formally in our working
paper. At a finite number of $\theta$ points in $(\theta_1, \theta_2)$, $q_H$ is vertical. Thus $q_H$ is technically a
correspondence rather than a function.

11 We use the term “number” throughout the discussion, but since there is a
continuum of individuals we formally mean “measure.”

In order to calculate gross supply, we first need to know what fraction
of type $\theta$ individuals develop businesses each period. Denote
this fraction by $w(\theta)$ (it is calculated in the Appendix). This
fraction is constant in stationary equilibrium and satisfies
$0 < w(\theta) < 1$ for $\theta$ in J and $w(\theta) = 1$ for $\theta$
in E. Because there is no aggregate uncertainty, once we know
$w(\theta)$ the distribution of business qualities is determined. Given
the policy of individuals (as illustrated in fig. 1), we can determine
the distribution of businesses of each vintage and how businesses flow
across sectors.

Gross supply $S(\hat{q})$ is explicitly calculated in the Appendix,
where it is shown to be strictly decreasing in $\hat{q}$. Gross demand
strictly increases with $\hat{q}$. Both functions are continuous.
Furthermore, at $\hat{q} = 0$ supply exceeds demand, while at
$\hat{q} = \bar{q}$ demand exceeds supply. Thus, by continuity and
strict monotonicity, there is a unique $\hat{q}$ that equates gross
demand and supply, and associated with this point is a unique
distribution of businesses. We have the following theorem.

Theorem 2. There exists a unique stationary equilibrium. 
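
The uniqueness argument behind theorem 2 is a single-crossing argument: a continuous, strictly decreasing gross supply and a continuous, strictly increasing gross demand, with excess supply at the lower endpoint and excess demand at the upper endpoint, cross exactly once. The following sketch locates such a crossing by bisection. It is only an illustration under stated assumptions: the linear `toy_supply` and `toy_demand` functions and the upper bound `Q_BAR` are hypothetical stand-ins, not the functions derived in the Appendix.

```python
def find_equilibrium_cutoff(supply, demand, lo, hi, tol=1e-10):
    """Bisect on excess supply S(q) - D(q).

    Assumes S is continuous and strictly decreasing, D is continuous and
    strictly increasing, S(lo) > D(lo), and S(hi) < D(hi).
    """
    def excess_supply(q):
        return supply(q) - demand(q)

    if not (excess_supply(lo) > 0.0 > excess_supply(hi)):
        raise ValueError("need excess supply at lo and excess demand at hi")

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess_supply(mid) > 0.0:
            lo = mid   # crossing lies to the right of mid
        else:
            hi = mid   # crossing lies at or to the left of mid
    return 0.5 * (lo + hi)


# Hypothetical functional forms, chosen only to satisfy the assumptions above.
Q_BAR = 1.0

def toy_supply(q):
    return 1.0 - 0.8 * q

def toy_demand(q):
    return 0.2 + 0.7 * q

q_hat = find_equilibrium_cutoff(toy_supply, toy_demand, 0.0, Q_BAR)
```

Because excess supply is strictly decreasing under these assumptions, the bracketing interval always contains the one crossing, so the procedure cannot converge to a spurious second root.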

We shall divide the study of the model’s implications into two sec¬ 
tions. In Section V, we examine the predictions regarding the dynam¬ 
ics of the business population within an economy. In Section VI, we 
examine aggregate measures of entrepreneurship. 

V. Dynamics of the Business Population 

As mentioned in Section II, the model has implications for entry and
exit, specialization, and business transfer. We examine each in turn
(the next section follows a similar format). As also mentioned, we
shall sometimes compare the implications of our model with those of a
single-type model (STM). In this model all individuals have the same
ability, $F(q, \theta) = F_0(q)$ for all $\theta$. We shall distinguish
the STM from the general model with heterogeneous ability types by
referring to the latter as the division of labor model (DLM).

A. Entry and Exit 

A question of substantial interest in industrial organization is how
the probability of business exit depends on business characteristics,
for example, business size and age. Let $I_k$ be an indicator variable
for business survival, $I_k = 1$ denoting the event “the business is
still in operation at the beginning of period $t + k$” (where $t$ is
the current period) and $I_k = 0$ denoting the event “the business is
discontinued before $t + k$.” Among other variables, the probability of
survival can be conditioned on age (the number of periods since
development) and size (the productivity level $q$ of the business
during the current period). In the special case of the STM,
$\Pr(I_k = 1 \mid q)$ is nondecreasing in $q$ while
$\Pr(I_k = 1 \mid q, \text{age})$ is independent of age. These
properties hold in the active-learning model of Pakes and Ericson
(1988).

Consider first the probability of business survival as a function of
the entrepreneurial ability $\theta$ of the individual currently
holding the business, conditioned on size $q$. Good proxies for
$\theta$ (e.g., education and experience) have been found in the
literature on agricultural entrepreneurship, so it may be possible to
empirically calculate such probabilities. In order to make the analysis
interesting, assume that $q$ satisfies
$q^M < \gamma^{k-1} q < \hat{q}$, so survival or demise by date $t + k$
is not assured. For the case of $q > \hat{q}$,
$\Pr(I_k = 1 \mid q, \theta)$ as a function of $\theta$ is illustrated
in figure 2. Suppose that $q$ is held by a $\theta$ in the M region.
The business $q$ will depreciate below $\hat{q}$ after a certain number
of periods. The probability that the person holding the business at
that date does not retire over the remaining periods is
$\Pr(I_k = 1 \mid q, \theta)$. This probability is independent of
$\theta$. For $\theta \in J$, $\Pr(I_k = 1 \mid q, \theta)$ is the same
as that for $\theta \in M$ until the $\theta \in J$ who would
discontinue the business a period before types in the M region would do
so, that is, $\theta$ satisfying $\gamma q_L(\theta) = q^M$. The
probability is a decreasing step function till the horizontal line
drawn at $q$ in figure 1 intersects the $q_H$ curve. For $\theta$ above
that point, the business is transferred in the current period to the M
region, so the probability jumps back up to its level in the M region,
creating a U-shaped relationship.

Figure 2 implies that relationships that are not conditioned on
$\theta$ will be complicated. Consider $\Pr(I_k = 1 \mid q)$. Observing
a high level of $q$ may very well indicate that the current operator
has high $\theta$. Figure 2 says that, with $q$ fixed, over some range
higher levels of $\theta$ imply a lower probability of survival.
Because of this, examples can be constructed in which
$\Pr(I_k = 1 \mid q)$ strictly decreases in $q$ for some range of $q$.
Consider next $\Pr(I_k = 1 \mid q, \text{age})$. Because of
depreciation, with $q$ fixed, the older the business, the higher the
initial quality of the business, and therefore the more likely the
founder has high $\theta$. But figure 2 indicates that, conditioned on
$q$, the survival probability is nonmonotonic in $\theta$. Therefore,
$\Pr(I_k = 1 \mid q, \text{age})$ may be nonmonotonic in age.

Finally, consider how survival probabilities vary with age when
unconditioned on business quality; that is, consider
$\Pr(I_k = 1 \mid \text{age})$. Because of depreciation, businesses are
eventually discontinued. This is a vintage effect and suggests that
$\Pr(I_k = 1 \mid \text{age})$ decreases in age (see Chari and
Hopenhayn [1988] and Benhabib and Rustichini [1989] for recent vintage
models). However, there is a selection process in this model that has
the opposite effect. The point can be made most easily with the special
case of no depreciation, $\gamma = 1$ (note that entry and exit still
occur in stationary equilibrium if there is net growth in population).
Consider a cohort of entering businesses. Those in the cohort with
realized productivity below the cutoff $q_L(\theta)$ of their developer
will be discontinued in the period immediately following development.
Those with quality $q \in [q_L(\theta), \hat{q}]$ will survive till
their developer retires. As time passes, businesses in the cohort with
productivity below $\hat{q}$ are gradually discontinued and the
probability of exit of a surviving member approaches zero. Hence,
$\Pr(I_k = 1 \mid \text{age})$ increases with age.
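
The selection effect in the no-depreciation case can be illustrated with a small calculation. The sketch below is a toy under loud assumptions: it ignores heterogeneity in $\theta$, starts from the businesses that have already passed the first-period cutoff, and splits them into a share `p_mid` with quality between $q_L$ and $\hat{q}$ (discontinued only when the developer retires, with per-period probability `delta`) and a share `p_high` with quality above $\hat{q}$ (never discontinued, since they are transferred on retirement). The function name and the parameter values are hypothetical.

```python
def exit_hazard_by_age(p_mid, p_high, delta, max_age):
    """Exit hazard among surviving cohort members, one entry per period.

    p_mid  : initial share of survivors that exit when the developer retires
    p_high : initial share of survivors that never exit (must be positive)
    delta  : per-period retirement probability, 0 < delta < 1
    """
    if p_mid < 0.0 or p_high <= 0.0 or not (0.0 < delta < 1.0):
        raise ValueError("need p_mid >= 0, p_high > 0, and 0 < delta < 1")

    hazards = []
    mid_alive = p_mid
    for _ in range(max_age):
        alive = mid_alive + p_high
        # Only the middle-quality group is at risk, and only through retirement.
        hazards.append(delta * mid_alive / alive)
        mid_alive *= 1.0 - delta
    return hazards


hazards = exit_hazard_by_age(p_mid=0.5, p_high=0.5, delta=0.1, max_age=40)
```

In this toy the hazard falls monotonically toward zero because the at-risk group shrinks geometrically while the high-quality group does not, which is the sense in which the survival probability rises with age.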

B. Specialization of Labor 

In the DLM, some individuals specialize in developing businesses.
Consequently, they are involved in numerous start-ups. But some
individuals in the STM also engage in many start-ups because start-ups
often fail. As mentioned in Section II, the models do differ in that a
person starts a second business in the STM only if his first business
has been discontinued, while in the DLM a person may start new
businesses even as previous businesses are managed. More formally,
define $n(\theta)$ as the “average” number of businesses operated
during the period that were started by a person of ability $\theta$.
Then $n(\theta) = 1$ for all $\theta$ in the STM. In the DLM,
$n(\theta) = 0$ for $\theta \in M$, $n(\theta) > 1$ for
$\theta \in J$, and $n(\theta) > 1$ for $\theta \in E$. Furthermore,
$n(\theta)$ is strictly increasing for $\theta \in E$. If transfer cost
is zero, the J region disappears and $n(\theta)$ is nondecreasing in
$\theta$ over the entire range of $\theta$.12 A related prediction is
that in the STM a person transferring a business is necessarily
retiring. In the DLM, an individual transferring a business may do so
to free up time to start a new business.

Consider the following evidence on specialization. First, there is
much anecdotal evidence that supports the DLM. An extreme example is
the case of William von Meister, who at age 45 has already started nine
businesses; the successful companies were all sold for millions of
dollars within 2 or 3 years of start-up (see “Starting Over,” INC. [May
1987], pp. 19-20). More formal evidence is provided by a recent study
of new firm creation in Minnesota and Pennsylvania (Reynolds 1988). In
this study, entrepreneurs who started new firms were asked about their
previous experience. Of the 3,100 entrepreneurs surveyed, 40 percent
had previously started a business (and hence were involved in their
second start-up), while 10 percent had previously started three or more
businesses. Unfortunately, the entrepreneurs were not asked if their
previous firms were still in existence. While this survey result
indicates that some individuals repeatedly engage in developing firms,
the evidence does not refute the STM because of the well-known high
failure rates for new firms. However, more convincing evidence is found
in Harris (1967). In his study of Nigerian entrepreneurship, he found
that some individuals engaged in new start-ups even as previous firms
continued to be operated. In fact, he found that the number of such
businesses was positively related to the entrepreneurs’ education. If
we allow that education is a good proxy for $\theta$, as has been found
in the agricultural entrepreneurship literature (see, e.g., Welch
1970), Harris finds that $n$ is increasing in $\theta$. Also relevant
is Ronen's (1983) empirical investigation of entrepreneurship. In his
surveys, he finds that entrepreneurs put a premium on pursuing novel
and innovative projects; the entrepreneurs tended to “pass on”
responsibilities of successful businesses as they “matured.”

Further evidence is found in a survey by the Commerce Department of
1,650 business owners who had sold (1,050) or discontinued (600) firms
in 1946 (see Ulmer and Nielsen 1947). Of the individuals who had sold
firms, only 38 percent did so because of “retirement, illness or
other.” There is also indication that some of these other sales served
to facilitate specialization of labor. For example, 20 percent of those
who sold firms chose “alternative opportunity” as the reason. The other
categories of “dispose of at a profit” and “avoid loss” could very well
include transfers motivated by specialization of labor. Further support
for the DLM comes from the question asking the individuals about their
current status (the survey was conducted about 4 months after the
disposal of the business). Here 29 percent of those in the survey
(Ulmer and Nielsen report these responses for the pooled group of those
who sold and discontinued) said that they were already “in another
business.”

12 While this is an “unlikely” outcome, it is theoretically possible for $n(\theta)$ to decrease
in the J region. This can occur only if $w(\theta)$ declines in $\theta$. Though higher $\theta$ are more
selective about the kinds of firms they manage, they also start firms of higher average
quality. If the second effect dominates the first, higher-$\theta$ types may end up spending a
smaller fraction of their lifetime developing firms.

C. Business Transfers 

We shall discuss three results relating to business transfers. Then we 
shall present available evidence on each. 

1. The Timing of Transfers 

The model has an implication for the timing of business transfer in
the life cycle of a business. Consider the special case of
$\alpha = 0$. In the DLM there is complete specialization. In the
period after development, the developer of a business either transfers
or discontinues the business. Hence, all age 1 businesses that survive
are transferred. For businesses of age greater than one, the
probability of transfer, conditioned on survival, falls to
$\delta < 1$. Thus in the DLM, young businesses are more likely to be
transferred than their older counterparts. By contrast, in the STM the
probability of transfer, conditioned on survival, is independent of
age.


2. Quality of Transferred Businesses 


Consider the cohort of all businesses developed at a particular period,
say period 0. If we normalize the population of active individuals at
period 0 to one, $N_0 = 1$, the size of this entering cohort is
$C_0 = \int_{J+E} w(\theta)\,d\theta$ (where a plus sign means
“union”). Define $G(q)$ as the cumulative distribution of business
productivities in the cohort at age 1 (the period after development),
so that $G(q) = [\int_{J+E} w(\theta) F(q, \theta)\,d\theta] \cdot C_0^{-1}$.
One of three events can happen to each business in the period after
development. Let $x_1 \in \{T, D, M\}$ denote the event, where
$x_1 = T$ means the business is transferred, $x_1 = D$ means it is
discontinued, and $x_1 = M$ means it is managed by its original
developer. Define $G(q \mid x_1)$ as the distribution of business
quality at age 1, conditioned on event $x_1$. The fate of a business at
age 1 depends on its realized productivity $q$ and the characteristics
of its developer (his $\theta$ and retirement status). The distribution
$G(q \mid x_1)$ is calculated by using figure 1.

In order to compare these probabilities we introduce a stronger
assumption on $F$. Assume that $f(q, \theta')/f(q, \theta)$ is
nondecreasing in $q$ for $\theta' > \theta$. This is known as the
monotone likelihood ratio property (MLRP), and it implies the earlier
stochastic dominance assumption (see Milgrom 1981). We have the
following result, which is true in both the DLM and the STM.

Theorem 3. Under the MLRP, the distribution of $q$ conditioned on
transfer at age 1 dominates the distribution conditioned on operation
by the original developer, that is,
$G(q \mid x_1 = T) \le G(q \mid x_1 = M)$.

Businesses that are transferred when young (i.e., at age 1) are on
average better than those that are managed by their original developer.
Two factors are at play. First, transfer indicates that the developer
has high entrepreneurial ability $\theta$. In figure 1, with $q$ fixed,
the greater $\theta$, the greater the likelihood of transfer. And the
higher the developer’s ability, the greater the quality (on average) of
the business. The second factor is that transfer directly provides
information that business quality is high. That is, if a business is
transferred, we know that its quality must exceed $\hat{q}$, while if a
business is managed in the J sector, its quality need exceed only
$q_L$.
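
The MLRP assumption behind theorem 3 is easy to check numerically for a candidate family of densities. The sketch below tests, on a finite grid, whether the likelihood ratio $f(q, \theta')/f(q, \theta)$ is nondecreasing in $q$ whenever $\theta' > \theta$. The family `toy_density`, with $f(q, \theta) = (1 + \theta) q^{\theta}$ on $(0, 1]$, is a hypothetical example chosen for illustration and is not taken from the paper; a grid check of this kind can refute MLRP but cannot prove it.

```python
def satisfies_mlrp(density, thetas, qs, tol=1e-12):
    """Grid check of the monotone likelihood ratio property.

    thetas and qs must be sorted in increasing order, and the density
    must be strictly positive at every grid point.
    """
    for i, low in enumerate(thetas):
        for high in thetas[i + 1:]:
            ratios = [density(q, high) / density(q, low) for q in qs]
            # The ratio must never fall as q increases.
            if any(b < a - tol for a, b in zip(ratios, ratios[1:])):
                return False
    return True


def toy_density(q, theta):
    # Hypothetical family on (0, 1]: higher theta shifts mass toward high q.
    return (1.0 + theta) * q ** theta


def reversed_density(q, theta):
    # Same family with the ability ordering flipped; it should fail the check.
    return toy_density(q, 1.0 - theta)


thetas = [0.0, 0.25, 0.5, 0.75, 1.0]
qs = [i / 100 for i in range(1, 101)]
mlrp_holds = satisfies_mlrp(toy_density, thetas, qs)
mlrp_fails = not satisfies_mlrp(reversed_density, thetas, qs)
```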

3. Survival of Transferred Businesses 

Another related issue is the length of time a business survives. Define
$\Pr(I_a = 1 \mid x_1)$ to be the probability that a business is not
discontinued before age $a$, conditioned on event $x_1$.

Theorem 4. Again under MLRP, for all $a$,
$\Pr(I_a = 1 \mid x_1 = T) \ge \Pr(I_a = 1 \mid x_1 = M)$.

Businesses that are transferred when young are more likely to survive
than those that are not. There are two explanations. First, from
theorem 3, transfer indicates initial high quality, and a business with
high initial quality tends to last. Second, since $q^M < q_L(\theta)$,
individuals in the M sector (who receive transfers) are less likely to
discontinue a business of given quality than a person in the J
sector.13

4. Evidence on Business Transfers 

We have some limited evidence regarding each of these implications.
For example, in their study of the first 8 years of the 1978 cohorts of
retail and manufacturing firms in Wisconsin, Pakes and Ericson (1988)
found that the vast majority of transfers occurred in the third period
of each cohort’s life. Almost no transfers occurred after this point.
We believe that this timing pattern lends further tentative support to
the DLM and suggests that a period in our model corresponds to roughly
3 years. Regarding evidence on quality, a main point of a book by
Ravenscraft and Scherer (1987) is that acquired lines of business are
of higher quality (usually measured by profitability) than those not
acquired.14 As for the survival patterns of firms, there is an old
study by Churchill (1955) on this issue. She found that transferred
firms had a greater life expectancy.15

13 This is a good point at which to discuss our assumption that individuals have
identical managerial ability. Would any of the three results above change if there were
differences? Let us examine the extreme case in which individuals do not differ in
entrepreneurial ability but do differ in managerial ability. For example, suppose that
the output produced when a person of type $\theta$ manages a quality $q$ business is $\theta q$. The
first and second results continue to hold (businesses will be transferred in the period
after development, and transferred businesses have above-average quality). However,
the third result concerning survival may not necessarily hold. Higher-quality businesses
will be transferred to types with higher managerial ability who will be more likely to
discontinue a business of any given quality.


VI. Aggregate Measures and Their Determinants 

In this section, we define and discuss measures of the extent of
entrepreneurial activity in the economy. For simplicity, we assume
throughout this section that the transfer cost $\alpha$ is zero, unless
otherwise mentioned. With zero transfer cost there is complete
specialization. The fraction of active individuals managing firms is
$\theta_1$, and the fraction starting firms is $1 - \theta_1$ (recall
that ability is uniformly distributed on the unit interval).


A. Entry and Exit 

The aggregate rates of business entry and exit equal the ratios of the
numbers of businesses started or discontinued during a period to the
beginning-of-period number of businesses. The entry rate $R_E$ equals
the fraction of individuals starting businesses with a correction for
net change in population, $R_E = (1 - \delta + \lambda) \cdot (1 - \theta_1)$.
The exit rate $R_X$ can be derived by noting that the net growth in the
number of businesses equals the net growth in the number of active
individuals, $R_E - R_X = \lambda - \delta$. A trivial implication is
that, to the extent that economies vary in dimensions other than net
growth $\lambda - \delta$, there will be a positive correlation across
economies between entry rates and exit rates.
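
The two accounting relations for the zero-transfer-cost case, $R_E = (1 - \delta + \lambda)(1 - \theta_1)$ and $R_E - R_X = \lambda - \delta$, can be written as a short sketch. The function names and the parameter values are hypothetical; in the model $\theta_1$ is an equilibrium object, whereas here it is simply taken as given.

```python
def entry_rate(delta, lam, theta_1):
    """R_E = (1 - delta + lam) * (1 - theta_1)."""
    return (1.0 - delta + lam) * (1.0 - theta_1)


def exit_rate(delta, lam, theta_1):
    """R_X implied by the identity R_E - R_X = lam - delta."""
    return entry_rate(delta, lam, theta_1) - (lam - delta)


# Two hypothetical economies with the same net growth (lam - delta)
# but different shares of managers theta_1.
economy_a = dict(delta=0.05, lam=0.06, theta_1=0.90)
economy_b = dict(delta=0.05, lam=0.06, theta_1=0.85)

rates_a = (entry_rate(**economy_a), exit_rate(**economy_a))
rates_b = (entry_rate(**economy_b), exit_rate(**economy_b))
```

Holding net growth fixed, the economy with the higher entry rate necessarily has the higher exit rate, which is the mechanical source of the positive cross-economy correlation noted in the text.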

1. Growth of New Labor Resources 

Theorem 5. An increase in $\lambda$ decreases $\hat{q}$ and $\theta_1$, increases $R_E$, and has
an ambiguous effect on $R_X$.

14 Lichtenberg and Siegel (1987) find that plants that are sold grow faster in produc¬ 
tivity than compatible plants that are not sold. See also Brown and Medoff (1987). 

15 We should note that Churchill's definitions of firm birth and death differ from
ours. She defined mortality to include transfer as well as discontinuance. She defined 
firm birth to include transfer as well as start-up. 




An increase in the birth rate increases the ratio of the number of
active individuals to the stock of previously developed businesses.
This increases demand relative to supply, and hence business prices
increase ($\hat{q}$ declines). This increase in the returns to
entrepreneurship spurs entry.16 With higher business prices, a business
of any given quality is less likely to be discontinued, and this tends
to decrease the exit rate. However, an increase in $\lambda$ can
increase the exit rate if $\gamma$ is close to one. When depreciation
is insignificant, a young business (i.e., a business of age 1) is on
average of lower quality than older businesses that have undergone the
selection process that occurs after development. An increase in
$\lambda$ shifts the age distribution toward young businesses and,
hence, can increase the exit rate.

2. The Pace of Product Change 

A central theme of the literature on entrepreneurship in agriculture
is that an increase in the rate of change (which means an increase in
$\rho$ in the model since $\rho$ is the inverse of $\gamma$) increases
the returns to entrepreneurship relative to the returns to managerial
activities and hence increases the resources devoted to the activity.
We find this to be true in our model (though stronger conditions are
needed).

Theorem 6. Assume that (i) $F$ satisfies MLRP, (ii)
$f(q, \theta)/[1 - F(q, \theta)]$ is nondecreasing in $q$ for all
$\theta$, and (iii) $(1 - \delta + \lambda) \cdot \rho \le 1$. Then an
increase in $\rho$ decreases $\hat{q}$ and $\theta_1$ and increases
$R_E$ and $R_X$.

Restriction ii is a regularity condition on the hazard rate of
$F(\cdot, \theta)$. Condition iii requires that net population growth
be below the real interest rate. Examples can be constructed that show
that a bound on $\lambda$ is required for this result.

B. Specialization of Labor 

One measure of the degree of specialization in the economy is the size
of the J region; the smaller the region, the greater the specialization
of labor. It is easy to show that $\theta_1$ increases and $\theta_2$
falls as $\alpha$ increases. Therefore, the degree of specialization
falls with an increase in the transfer cost. Another determinant of
specialization is, of course, the degree of heterogeneity in ability
(i.e., the degree to which $F(\cdot, \theta)$ varies with $\theta$). In
this paper we have taken the degree of heterogeneity to be fixed.
However, we feel that a particularly fruitful line for future work is
to permit investment in this ability. As mentioned in Section II,
diversity of skills will arise endogenously if ability can be enhanced
through education or experience (Rosen 1983b). Imagine for the moment
that investment is permitted in the model. Consider some comparative
statics in such a model, in particular, the effect of a reduction in
$\delta$. The incentive for investment in human capital is increased
because the “fixed” cost of the investment is now spread out over a
greater (expected) number of periods in the work force. This leads to
greater investment and hence more specialization. The logic of this
result parallels the famous result that larger market size permits
greater specialization (Smith 1776).

16 The result can be extended to the case $\alpha > 0$ and $\delta = 0$. We conjecture that the
result is true for the general case $\alpha > 0$ and $\delta > 0$, but the complexity of the calculations
makes this difficult to verify.

C. Business Transfers 

The transfer rate $R_T$ in the economy is the fraction of all
businesses in existence at the beginning of a period that are
transferred during the period. It is useful to express the transfer
rate as the sum of two parts, the first corresponding to young
businesses (i.e., those developed in the previous period) and the
second to established businesses (i.e., those of age 2 or greater).
Define $R^s_{est}$ to be the fraction of the beginning-of-period
businesses that are established and that survive this period. The
transfer rate is then

$$R_T = \int_{\theta_1}^{1} [1 - F(\hat{q}, \theta)]\,d\theta + \delta R^s_{est}. \qquad (3)$$

As for the first term, with $\alpha = 0$ there is complete
specialization and all young businesses that survive this period are
transferred. As for the second term, the fraction of surviving
established firms that are transferred equals $\delta$.17 In the STM,
the fraction of surviving young firms that are transferred is
$\delta$, the same as for older firms, so

$$R_T^{STM} = \delta (1 - R_X). \qquad (4)$$
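
The two transfer-rate expressions can be compared numerically. The sketch below evaluates the DLM rate as the young-survivor integral plus $\delta$ times the established survivors, reading footnote 17 as the identity that established survivors plus surviving young firms equal $(1 - \delta + \lambda)\theta_1$, and evaluates the STM rate as $\delta(1 - R_X)$. Everything specific here is an assumption made for illustration: the ability family `toy_cdf` with $F(q, \theta) = q^{1+\theta}$ on $[0, 1]$ is hypothetical, and `q_hat` and `theta_1` are treated as given numbers rather than solved from the model.

```python
import math


def young_survivor_term(q_hat, theta_1, cdf, steps=10_000):
    """Midpoint-rule value of the integral of 1 - F(q_hat, theta) over [theta_1, 1]."""
    width = (1.0 - theta_1) / steps
    total = 0.0
    for i in range(steps):
        theta = theta_1 + (i + 0.5) * width
        total += 1.0 - cdf(q_hat, theta)
    return total * width


def dlm_transfer_rate(q_hat, theta_1, delta, lam, cdf):
    """Young-survivor term plus delta times established survivors."""
    young = young_survivor_term(q_hat, theta_1, cdf)
    # Assumed reading of footnote 17: established + young = (1 - delta + lam) * theta_1.
    established = (1.0 - delta + lam) * theta_1 - young
    return young + delta * established


def stm_transfer_rate(delta, exit_rate):
    """Single-type model: delta * (1 - R_X)."""
    return delta * (1.0 - exit_rate)


def toy_cdf(q, theta):
    # Hypothetical family: higher theta first-order dominates lower theta.
    return q ** (1.0 + theta)


def toy_young_term_closed_form(q_hat, theta_1):
    # Exact integral for toy_cdf, used only to sanity-check the quadrature.
    return (1.0 - theta_1) - q_hat * (q_hat ** theta_1 - q_hat) / math.log(1.0 / q_hat)


r_t_dlm = dlm_transfer_rate(q_hat=0.5, theta_1=0.8, delta=0.05, lam=0.05, cdf=toy_cdf)
r_t_stm = stm_transfer_rate(delta=0.05, exit_rate=0.065)
```

With these illustrative numbers the DLM rate is several times the STM rate, because in the DLM every surviving young business changes hands while in the STM only retirements generate transfers.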


1. The Determinants of the Transfer Rate 

We again consider the effects of the parameters $\lambda$ and $\rho$.
If either parameter increases, $\theta_1$ and $\hat{q}$ fall, implying
that the first term of (3) increases. In words, as growth or the pace
of change increases, there is greater start-up activity as well as
lower standards for success, so the number of young businesses that
survive increases. The net effect on the transfer rate is positive if
this positive effect outweighs any offsetting effect of the second
term. By straightforward calculations, a sufficient condition for a net
positive effect is that

$$1 - F(\hat{q}, \theta_1) > \frac{\delta (1 - \delta + \lambda)}{1 - \delta} \qquad (5)$$

hold. This condition is satisfied, for example, if the probability of
initial success for the marginal entrepreneur (the left-hand side) is
above one-half and if $\lambda$ and $\delta$ are both less than
one-third. To determine the effects in the STM, recall from equation
(4) that the transfer rate varies inversely with the exit rate $R_X$.
Hence, from theorem 5, $\lambda$ has an ambiguous effect on
$R_T^{STM}$. From theorem 6, $R_T^{STM}$ decreases with $\rho$. In
summary, we get the following theorem.

17 An equation for $R^s_{est}$ can be obtained by noting that, making the appropriate
population normalization, $R^s_{est}$ plus the number of surviving young firms (the first
term of [3]) must equal $(1 - \delta + \lambda)\theta_1$, the number managing firms in the current
period. This equation is used to derive condition (5).

Theorem 7. (i) If condition (5) holds, $R_T$ is strictly increasing in $\lambda$,
and if the assumptions of theorem 6 also hold, it is strictly increasing
in $\rho$. (ii) In the STM, the effect of $\lambda$ on $R_T^{STM}$ is ambiguous, while
under the conditions of theorem 6, an increase in $\rho$ decreases $R_T^{STM}$.


D. Some Preliminary Evidence on Aggregate Measures 

We present evidence based on a measurement program conducted by 
the Commerce Department in the 1940s and 1950s, which kept count 
of the total number of U.S. business firms as well as the turnover in 
the stock, including the number of new firms, discontinued firms, and 
transferred firms.18 In theory, any firm with one or more employees
or a place of business (except agricultural and professional service 
firms) would be included in the Commerce Department population. 
It is the most comprehensive measurement program of business 
transfer that we are aware of. One drawback is that the transfer 
category includes transactions that do not involve changes in business 
operation. However, we are confident that a substantial portion of the 
recorded transfers in the Commerce Department data do involve 
movement of individuals across businesses. 19 


18 For other analyses of business sales, see Lichtenberg and Siegel (1987) and Brown
and Medoff (1987).

19 The Commerce Department transfer data comprise three categories. The first is 
sales of firms. A sale will involve a change in the operator of a business if the business is 
owner operated. The vast majority of firms in this data set are presumably owner 
operated (see n. 20). The second category is changes in partnerships due to the depar¬ 
ture of a partner. These presumably involve changes in the operation of a business. 
The third category is conversion of legal status, e.g., from a sole proprietorship or a 
partnership to a corporation. These do not necessarily involve changes in operation. 
However, these reorganizations make up only a small fraction of measured Commerce 
Department transfers. In his analysis of Internal Revenue Service records, Crum 
(1955) found that there were 20,654 new incorporations due to reorganizations in 
1946, and this represents about 4 percent of total Commerce Department transfers for
that year. 
TABLE 1

Turnover Rates per 100 Firms, Average of Annual Rates, 1951-55,
by Two-Digit Industry in the Manufacturing Sector

Industry                               Entry      Exit       Net  Transfer  Transfer
                                        Rate      Rate    Growth      Rate    Hazard

Food and kindred products                3.3       4.5      -1.1       5.0       5.2
Textile mill products                    6.1       9.1      -2.8       2.8       3.1
Apparel                                  9.3      10.8      -1.5       4.3       4.8
Leather and leather products             5.3       7.8      -2.3       3.0       3.3
Lumber and timber basic products        17.5      18.2       -.8       5.2       6.4
Furniture and fixtures                   7.1       6.6        .5       5.2       5.6
Paper and allied products                4.8       3.5       1.3       3.5       3.6
Printing and publishing                  4.5       3.3       1.2       5.9       6.1
Chemicals and allied products            6.0       5.9        .1       3.8       4.0
Products of petroleum and coal           7.2       3.9       3.6       4.6       4.8
Stone, clay, and glass products          5.8       6.4       -.6       5.4       5.8
Primary metal industries                 5.2       4.6        .7       4.4       4.6
Fabricated metals                        8.8       5.5       3.5       5.9       6.2
Machinery except electrical             10.0       6.9       3.2       6.0       6.4
Electrical machinery                    10.4       6.3       4.5       4.1       4.4
Transportation equipment                10.2       5.5       5.1       5.8       6.1
Professional instruments                 5.7       5.4        .3       3.8       4.0
Rubber products                          7.7       2.7       5.5       4.2       4.3

Mean                                     7.5       6.5       1.1       4.6       4.9
Standard deviation                       3.2       3.5       2.5       1.0       1.1

Source: Churchill (1959a, table 3).


Firm turnover data from the program are presented for two-digit
manufacturing and retail industries in tables 1 and 2, respectively
(note that in this section we are assuming that a two-digit industry
corresponds to our model economy). The rates are averages of the
annual rates over the 5-year period 1951-55. Of particular note in
these tables is the fact that transfer rates are of the same order of
magnitude as entry and exit rates. Net growth equals average annual
net growth in the number of firms over the period and corresponds to
$\lambda - \delta$ in the model. The transfer hazard is the fraction of
all businesses not discontinued that are transferred, that is,
$R_T/(1 - R_X)$. In the STM, this equals $\delta$.

In the manufacturing sector the transfer hazard averages 5 percent.
There is very little variation across industries, particularly compared
with the variation in entry and exit rates. As $\delta = .05$
corresponds to an expected stay in the work force of 20 years, it is
not readily apparent that the STM is incapable of organizing these
data. However, consider the retail sector.20 There the transfer hazard
in table 2 ranges from 3.9 for lumber and building materials to 32.4
for filling stations. With such wide variance in transfer hazards the
STM appears in trouble.
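
The transfer hazard, $R_T/(1 - R_X)$, and the link between $\delta$ and the expected stay in the work force can be checked with a few lines of arithmetic. The sketch below recomputes the hazard for three manufacturing rows of table 1 from their transfer and exit rates (both per 100 firms). The function names are hypothetical, and the expected-stay formula $1/\delta$ assumes a constant per-period exit probability. Because the table reports averages of annual rates, a hazard recomputed from the averaged rates need not match the published column exactly in every row.

```python
def transfer_hazard(transfer_rate, exit_rate):
    """Transfers per 100 firms not discontinued; inputs are rates per 100 firms."""
    return 100.0 * transfer_rate / (100.0 - exit_rate)


def expected_stay(delta):
    """Expected periods in the work force with constant exit probability delta."""
    return 1.0 / delta


# (transfer rate, exit rate) pairs copied from table 1.
rows = {
    "Food and kindred products": (5.0, 4.5),
    "Textile mill products": (2.8, 9.1),
    "Apparel": (4.3, 10.8),
}

hazards = {
    name: round(transfer_hazard(transfer, exit_), 1)
    for name, (transfer, exit_) in rows.items()
}
years_at_five_percent = expected_stay(0.05)
```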

10 Ninety-nine percent of all transfers in retail that took placeibetween 1951 and 1955 
involved firms with 19 or fewer employees (Churchill 19595). Consequently, we are 
TABLE 2

Turnover Rates per 100 Firms, Average of Annual Rates, 1951-55,
by Two-Digit Industry in the Retail Sector

                                    Entry    Exit     Net     Transfer  Transfer
Industry                            Rate     Rate    Growth     Rate     Hazard

General merchandise                  2.8      3.7      -.9       3.9       4.0
Grocery, with and without meats      4.0      5.6     -1.5      10.8      11.4
Meat and seafood                     5.2      7.8     -2.5       8.2       8.9
Other food                           5.4      8.3     -2.8       6.8       7.4
Motor vehicles                      13.2     10.8      2.5       9.3      10.4
Filling stations                    12.0      7.5      4.9      29.8      32.4
Automotive parts and accessories     5.8      6.0      -.3       5.2       5.5
Apparel                              7.3      5.9      1.4       6.2       6.6
Shoes                                8.2      5.4      3.0       7.0       7.4
Lumber and building materials        4.2      5.5     -1.2       3.7       3.9
Hardware and farm implements         4.2      2.7      1.6       8.9       9.1
Appliances and radios               11.9      8.3      3.8       6.3       6.9
Home furnishings                     9.0      7.0      2.1       5.3       5.7
Eating and drinking places           9.9      9.5       .4      21.0      23.2
Drugs                                3.0      2.7       .4       7.2       7.4
Liquor                               6.1      2.8      3.6      19.9      20.5

Mean                                 7.0      6.2       .9      10.0      10.7
Standard deviation                   3.2      2.4      2.3       7.0       7.6


Source —Churchill (195%, table 3) 


The DLM is capable of generating wide variations in transfer 
hazards, but what about other aspects of the data? Table 3 displays 
cross-section correlations in turnover rates for several data sets. 21 Of
particular interest are the positive correlations between transfer rates 
and net growth and the positive correlations between transfer rates 
and entry rates. Suppose that $\lambda$ were to vary across industries in the
DLM. Then both correlations would result (theorems 5 and 7). Sup¬ 
pose instead that p varied across industries. Then the positive correla¬ 
tion between transfer and entry would result (theorems 6 and 7). 
Note that in the STM this correlation would be negative. 

VII. Conclusion 

We have sought to contribute to the literature on entrepreneurship 
initiated by Theodore W. Schultz. We have developed a theory based 


confident that those transfers that were sales (see n. 19) primarily involved owner- 
operated firms. 

21 Dunne, Roberts, and Samuelson (1988) have calculated rates of entry and exit for
more recent data. They find, as we do, a positive cross-industry correlation between
entry and exit (see table 3).
TABLE 3

Cross-Section Correlations in Turnover Rates

                                   Data Set (Time Period)

                      Manufacturing   Retail     Services    States     Regions
                        (1951-55)    (1951-55)  (1951-55)  (1945-50)  (1950-53)

Entry/exit                 .75          .74        .53        .93        .67
Transfer/entry             .31          .42       -.10        .76        .66
Transfer/exit             -.02          .16       -.31        .82        .86
Transfer/net growth        .42          .48        .12        .57        .32
Observations                18           16          9         49          7

Source.—Two-digit industry data, Churchill (1959a); state and region data, Churchill (1954).


on two key ideas from this literature: (1) entrepreneurs are those that
pursue the new opportunities emerging from technological breakthroughs,
and (2) individuals differ in their ability to pursue such
opportunities.

Among the implications of the theory, we find that there will be 
specialization of labor. This specialization will lead to changes in who 
operates particular businesses through time. We have called these 
changes in operation “business transfers.” Preliminary evidence was 
presented that suggests that business transfers represent important 
resource flows and that they serve to facilitate division of labor. But 
most business transfers are not currently measured. And the common 
elements shared by the stories of Steve Jobs and William von Meister 
and the multitude of other individuals changing jobs within and 
across firms to pursue other development opportunities are seldom 
recognized. It is hoped that this situation may change with future 
research. 


Appendix 

A. Calculation of w 

Fix $\theta$ and $\hat{q}$. Assume that at each date, $\theta$ types behave optimally given $\hat{q}$ (i.e.,
according to fig. 1). Let $w(\theta, \hat{q}) \in (0, 1]$ be such that if the fraction of active $\theta$
types who started businesses in all previous periods is constant and equal to
$w(\theta, \hat{q})$, then the fraction starting businesses in the current period also equals
$w(\theta, \hat{q})$. Of course, $w(\theta, \hat{q}) = 0$ for $\theta$ in the M region and $w(\theta, \hat{q}) = 1$ for $\theta$ in the
E region. The fraction $w(\theta, \hat{q})$ for $\theta \in J$ is calculated as follows. Define $Z(\theta, \hat{q})$ to
be the maximum number of periods type $\theta$ would ever manage a business
before discontinuing it; Z(0, $) is the minimum i such that Y?w(0> 4 )< 4) U 

is nondecreasing in 4)- The fraction u>(0, q) is then determined by (where p = 

y~ l ) 


The left-hand side is the fraction of the $\theta$ population not developing businesses.
The first term in the right-hand-side summation is the fraction of the
current population who were around last period, $(1 - \delta)/(1 - \delta + \lambda)$, developed
businesses ($w$), and manage those businesses this period ($q \in [q_L, q_H]$);
the second term is the fraction of the current population who were around
two periods ago and developed businesses that they still use today (the original
productivity must have been between $\rho q_L$ and $q_H$); and similarly for the
remaining terms. Given the properties of $q_L$, $q_H$, and $Z$, $w$ is strictly decreasing
in $\hat{q}$.

B. Calculation of Gross Supply 

Let $s(\theta, \hat{q})$ be the gross supply for each $\theta$, that is, the number of businesses that
flow into the M sector in the current period plus the number currently held
by individuals in the M sector that were originally developed by some individual
of type $\theta$ and that will be managed in the current period. Total gross
supply is the sum of the supply of each type across the J and E regions, $S(\hat{q}) = \int_{J+E} s(\theta, \hat{q})\,d\theta$.
Here we explicitly calculate $s(\theta, \hat{q})$ for $\theta \in J$; the case of $\theta \in E$
follows similar calculations. Assume $\theta \in J$. Define $g_i(q, \theta, \hat{q})$ to be the probability
that a business developed by $\theta$ with original productivity $q$ is in the M sector
when the business is $i$ periods old. The function is defined recursively by

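$$
g_1(q, \theta, \hat{q}) =
\begin{cases}
1 & \text{if } q > q_H(\theta, \hat{q}) \\
\delta & \text{if } \hat{q} \le q \le q_H(\theta, \hat{q}) \\
0 & \text{if } q < \hat{q},
\end{cases}
$$

$$
g_i(q, \theta, \hat{q}) =
\begin{cases}
g_{i-1} + \delta(1 - g_{i-1}) & \text{if } \gamma^{i-1} q > \hat{q} \\
(1 - \delta)\, g_{i-1} & \text{if } q_L \le \gamma^{i-1} q \le \hat{q} \\
0 & \text{if } \gamma^{i-1} q < q_L.
\end{cases}
$$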

From figure 1, if a business is of age 1 (i.e., was developed in the previous
period) and $q > q_H$, then it is transferred so that $g_1 = 1$. If $q_H \ge q \ge \hat{q}$, then the
business is sold to the M sector if the owner retires. Consider a business $i > 1$
periods old that had original quality $q$. If $\gamma^{i-1} q > \hat{q}$, then either the business is
already in the M sector (with probability $g_{i-1}$) or, if it is held by the original
owner (with probability $1 - g_{i-1}$), it will be transferred if the owner retires. If
$\gamma^{i-1} q \in [q_L, \hat{q}]$, then the business will be in the M sector when it is $i$ periods old
only if it was there last period (with probability $g_{i-1}$) and its current owner
does not retire. It is straightforward to show that the probabilities $g_i$ are
nonincreasing in $\hat{q}$.
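The recursion described in this paragraph can be traced with a short sketch. Everything named here is illustrative: the thresholds (written `q_hat`, `q_low`, `q_high`) are passed in as fixed numbers, whereas in the paper they depend on the owner's type, and the parameter values in the usage lines are invented purely for demonstration.

```python
def market_probabilities(q, q_hat, q_low, q_high, delta, gamma, max_age):
    """Return [g_1, ..., g_max_age]: the probability that a business of
    original quality q is in the M sector at each age."""
    # Age 1: transferred for sure above q_high, sold only on retirement
    # (probability delta) between q_hat and q_high, never below q_hat.
    if q > q_high:
        g = 1.0
    elif q >= q_hat:
        g = delta
    else:
        g = 0.0
    probs = [g]
    for age in range(2, max_age + 1):
        current_q = gamma ** (age - 1) * q  # quality after depreciation
        if current_q > q_hat:
            # Already in the M sector, or the original owner retires now.
            g = g + delta * (1.0 - g)
        elif current_q >= q_low:
            # Stays in the M sector only if the current owner does not retire.
            g = (1.0 - delta) * g
        else:
            # Below q_low the business is discontinued.
            g = 0.0
        probs.append(g)
    return probs


# Hypothetical parameter values, for illustration only.
print(market_probabilities(q=1.0, q_hat=0.8, q_low=0.5, q_high=1.2,
                           delta=0.05, gamma=0.9, max_age=4))
```

With these made-up numbers the probability rises while depreciated quality stays above `q_hat` and then shrinks by the factor `1 - delta` once quality falls into the band between `q_low` and `q_hat`.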

The $g_i$ functions tell us the fraction of businesses of a given initial quality
and of a given age that end up in the M sector. It remains to determine the
number of businesses in this cohort. The quantity $w(\theta, \hat{q}) f(q, \theta)$ is the number
of quality $q$ businesses developed each period (net growth in population size is
taken account of shortly). A fraction $g_1(q, \theta, \hat{q})$ of these businesses are in the M
sector when one period old, a fraction $g_2$ when two periods old, and so on.
Hence, defining $k(\hat{q})$ as the maximum number of periods a business will
survive (k($) is the minimum i such that y'S < qi \ note that k is nonincreas¬ 
ing in $), we have that the gross supply of businesses of original quality q by 
type 0 is 

$$
s(q, \theta, \hat{q}) = \sum_{i=1}^{k} (1 - \delta + \lambda)^{-i}\, w(\theta, \hat{q})\, f(q, \theta)\, g_i(q, \theta, \hat{q}).
$$
Note that we multiply by $(1 - \delta + \lambda)^{-i}$ to take into account net change in
population size. The gross supply for businesses $q \ge \hat{q}$ for type $\theta \in J$ is

$$
s(\theta, \hat{q}) = \int_{q \ge \hat{q}} s(q, \theta, \hat{q})\, dq.
$$
Share Tendering Strategies and the Success 
of Hostile Takeover Bids 


David Hirshleifer and Sheridan Titman 

University of California, Los Angeles 


This paper presents a model of tender offers in which the bid perfectly
reveals the bidder's private information about the size of the
value improvement that can be generated by a takeover. We argue 
that bidders with greater improvements will offer higher premia to 
ensure that sufficient shares are tendered to obtain control. The 
model relates announcement date returns and takeover success or 
failure to the amount bid, the initial shareholdings of the bidder, the 
number of shares the bidder attempts to purchase, the dilution of 
minority shareholders, and managerial opposition. We show that 
managerial defensive measures will sometimes increase the probabil¬ 
ity of the offer’s success, either by raising the incentive to bid high or 
by decreasing the asymmetry of information about the improve¬ 
ment. 


When a hostile bidder makes a tender offer for a widely held firm, 
target shareholders must evaluate competing claims to decide 
whether or not to tender their shares. Bidders typically accuse the 
incumbent management of mismanaging the firm and claim that they 
are offering a fair price for its shares that reflects the higher value of 
the target under their direction. Management, on the other hand, 
University, Columbia University, Northwestern University, Stanford University, Van¬ 
derbilt University, University of Washington, and the Symposium on Financial Con¬ 
tracting at Indiana University for helpful comments. 
often accuses the bidder (“raider”) of trying to buy shares on the 
cheap, offering insufficient compensation given the true value of the 
firm's assets. 

The target shareholders’ assessment of these claims generally can¬ 
not be known in advance. Furthermore, shareholders may have 
specific attributes such as liquidity or tax considerations that affect 
their tendering decisions. For these reasons, the outcome of an offer 
is generally uncertain. This is evidenced by the observed negative 
price reactions of target shares on announcements of the failure of an 
offer (Bradley, Desai, and Kim 1983; Samuelson and Rosenthal 1986; 
Ruback 1988) and by the positive reactions of the bidder’s stock price 
to success and negative reactions to failure (Bradley 1980). 

Grossman and Hart (1980) were the first to explain that, because of 
a free-rider problem, target shareholders may rationally turn down 
bids that offer substantial premia over the current market price. They 
argued that if atomistic shareholders of the target firm are able to 
share fully in the improvements brought about by a successful 
takeover without tendering their own shares, they will not accept an 
offer unless the price equals or exceeds the posttakeover value of the 
shares. They further argued that if takeovers are to be profitable, 
bidders must be able to dilute the posttakeover value of the shares 
that are not tendered. The threat of dilution induces target share¬ 
holders to tender at a price that allows the bidder to cover his costs 
associated with the takeover. 

Shleifer and Vishny (1986) pointed out that takeovers may still be 
profitably undertaken without dilution if the bidder had accumulated 
a large fraction of the target firm's shares prior to publicly announc¬ 
ing the offer. Although these large shareholders cannot on average 
profit from the additional shares they purchase, they realize gains on 
the shares they owned prior to the tender offer that are at least 
sufficient to cover their costs. An important innovation in the Shleifer 
and Vishny paper is the introduction of an informational asymmetry 
between the purchaser, who knows the posttakeover value of the 
target firm, and the target shareholders, who do not. Given this asym¬ 
metry, target shareholders cannot be certain whether it is in their 
interest to tender their shares. 

Although the Grossman and Hart and Shleifer and Vishny papers 
provide important intuition about why shareholders might view some 
offers as inadequate, in their analyses no observed bid ever fails. With 
the reservation prices of the target shareholders known with certainty,
a given offer either is high enough to succeed or else will fail
with certainty. Since bids that fail with certainty are obviously unprofitable,
the bids in these models will be made only at the minimum
acceptable price. 
This paper provides a model in which observed tender offers sometimes
do fail. The argument above indicates that for offers to fail, the
bidder must be uncertain about the prices at which shareholders will 
tender. This type of uncertainty will arise if shareholders have some 
personal costs and benefits of tendering (e.g., transaction costs and 
tax or liquidity considerations) that are not known by the bidder. 
Even if these costs are zero, bidders may still be uncertain about the 
outcome of an offer if shareholders follow mixed strategies (i.e., ran¬ 
domize) when they are indifferent about whether or not to tender 
their shares. We show that similar results can be obtained with either 
setting. However, since the mixed-strategy equilibrium is more trac¬ 
table, we relegate the development of the model with random tender¬ 
ing costs to Appendix B. 1 

A fundamental property of the bidding game we describe is that 
shareholders are more likely to accept high bids than low ones. In 
consequence, bidders with low potential gains from the takeover can 
bid low to separate themselves credibly from high-gain bidders. A 
high-gain bidder will not find it in his interest to offer as low a bid 
because rejection is more costly to him. The greater willingness of 
low-value bidders to make low offers leads to an equilibrium in which 
the offer perfectly reveals the information of the bidder, with the bid 
exactly equaling the posttakeover value of the shares. 

A positive relation between the bid premium and the probability of 
offer success is also a feature of recent takeover models with multiple 
bidders (e.g., Giammarino and Heinkel 1986; Fishman 1988; Hirshleifer
and Png 1990). 2 These papers differ from those of Grossman
and Hart and of Shleifer and Vishny in assuming that offers are made
to management, rather than directly to shareholders. This assumption 
is more relevant for friendly merger bids, in which the target behaves 
as a unit, than for hostile tender offers, which are subject to a free¬ 
rider problem among shareholders. 3 

This paper also examines scenarios that allow for a free-rider prob¬ 
lem among target shareholders and yet also allow managers to affect 


1 Harsanyi (1973) has shown that in some games, as private shocks to player-specific 
costs and benefits become arbitrarily small, the behavior of the players can be described 
by a mixed-strategy equilibrium. 

2 The model is also similar to the recent work by Giammarino and Lewis (1988) that
analyzes the decision of a firm to issue new shares to finance a known investment
project. In their separating equilibrium, the higher-valued firm offers shares at a
higher price, taking the risk that the issue will be rejected. Because their cost of failure 
is higher, lower-valued firms do not mimic this action. 

3 Berkovitch and Khanna (1988) examine the choice of friendly and hostile takeover
methods in a single model. Morck, Shleifer, and Vishny (1988) have provided evidence 
that tender offers are used in hostile takeovers to discipline poorly performing man¬ 
agement, while merger bids are more likely to be associated with friendly takeovers. 
the success or failure of bids by resisting offers. We show that the 
managerial defensive measures affect the success of takeovers not 
only by affecting the strategies of bidders but also by affecting how 
shareholders interpret the bid. In particular, some defensive actions 
can potentially increase the likelihood of the bid’s success, either by 
raising the incentive to bid high or by decreasing the asymmetry of 
information about the improvement. As a result, defensive measures 
can potentially improve welfare as well as increase the value of the 
target firm. However, defensive strategies can also be designed to 
entrench management and thereby reduce welfare. 


I. Offer Prices as Signals of Posttakeover Value 

A. The Basic Model 

This section presents the model in its simplest form. We assume that, 
with the exception of one potential acquirer who owns the fraction $\alpha$
of the firm’s shares, shareholders of the target firm are atomistic and
hence view the success of the tender offer as independent of their
individual tendering decisions. These holdings are determined exogenously
and are unrelated to the posttakeover value of the firm. If the
potential acquirer can successfully purchase the fraction $0.5 - \alpha$ of
the firm’s shares, he can gain control of the firm and improve its value
by the amount $z$ per share. This amount, which is bounded above by $\bar{z}$,
is known only to the potential acquirer. 4 To simplify the notation we
assume that the firm’s value under the incumbent management is
zero. If the potential acquirer attempts to take over the firm, a conditional
tender offer is made for a controlling portion of the firm’s
shares. In other words, the bidder makes no purchase unless the 
number of shares tendered is at least as large as the number he has 
chosen to bid for; otherwise the offer fails. 5 We assume that the potential


4 The model may be expanded so as to make $\bar{z}$ endogenous. Suppose that the range
of possible positive values of z is unbounded. We assume that the manager maximizes
his own expected wealth and possesses initial shareholdings y in the firm. He obtains
perquisites with value Q from control of the firm, but in deciding whether to accept or
reject a friendly merger bid (rather than a tender offer), he balances this against the
profit he obtains from selling his shares, x"y, and allowing the merger to take place,
where x“ is the premium per share in a merger bid. This implies a critical value for the
improvement, called $\bar{z}$, above which the bidder prefers to make a friendly merger bid,
which will be accepted, rather than a hostile tender offer.

5 We have also examined unconditional or “any and all” tender offers. Although the
analysis is somewhat more complex, it yields essentially the same substantive results.
Currently, a large proportion of tender offers are made unconditionally. However,
since these offers can be withdrawn prior to the expiration date if the bidder believes
that an insufficient number of shares will be tendered, we think our characterization of
conditional offers probably offers a realistic description of unconditional offers as well.
See Bagnoli and Lipman (1988) for further analysis of “any and all” tender offers.

acquirer gets only one opportunity to bid: if it is rejected, there
is no opportunity for later upward revisions. The critical aspect of this 
assumption is that the bidder's loss from being rejected is increasing 
with the size of his improvement. At the end of this section, we shall 
discuss ways in which the assumption of just a single bid can be re¬ 
laxed. It is assumed that all market participants are risk neutral. 

Up to this point, the assumptions are essentially identical to those in 
Shleifer and Vishny (1986). They imply that shareholders will turn 
down all bids that are less than the expected posttakeover value of the 
nontendered shares and accept all bids that exceed this value. Shleifer 
and Vishny further assume that shareholders always accept bids that 
make them indifferent. Given this assumption, they show that in equi¬ 
librium all bidders make the same bid, and shareholders always ten¬ 
der because the equilibrium bid equals the expected value as assessed 
by shareholders given that bid. Hence, observed bids never fail. 

To construct an alternative equilibrium in which bids sometimes do 
fail, we begin by describing the problem faced by a bidder. Let x be 
the amount per share bid, and let $\omega$ be the fraction of the outstanding
shares for which he bids. Let $P(x; \alpha, \omega)$ be the probability that at least
$\omega$ shares are tendered to a potential acquirer who bids $x$ and begins
with an initial shareholding of $\alpha$ in the target. Let $C$ be the cost of
making a bid. Although $C$ is known to the bidder, it need not be
known to target shareholders. If more than $\omega$ shares are tendered,
the shares are prorated, so that the bidder still pays $x$ per share for $\omega$
shares.

We shall propose an equilibrium in which the potential acquirer, if 
he chooses to bid, makes an offer for exactly $\omega = 0.5 - \alpha$ shares,
independent of the level of z. If a bid is made, the level of the bid is 
chosen to maximize his expected gain, 

$$
\max_{x}\ [\alpha z + (z - x)\omega]\, P(x; \alpha, \omega) - C. \tag{1}
$$

If we assume that $P(x; \alpha, \omega)$ is twice differentiable with respect to the
amount of the bid, the sufficient first- and second-order conditions
with respect to $x$ are

$$
P'[\alpha z + (z - x)\omega] - P\omega = 0 \tag{2}
$$

and

$$
P''[\alpha z + (z - x)\omega] - 2P'\omega < 0. \tag{3}
$$

We assume that for each z there exists an x such that (2) and (3) obtain 
to ensure an interior optimum. 6 Then the following proposition holds 
(all proofs are in App. A). 

6 A rather mild condition on the probability schedule that ensures this is that $P(0) =$
Proposition 1. If the probability of success $P(x; \alpha, \omega)$ is strictly
increasing in the level of the bid, then the optimal bid $x(z)$ is a strictly
increasing function of the improvement $z$.

The intuition is fairly straightforward. Bidders who can realize 
higher improvements are willing to bid higher to increase the proba¬ 
bility that the offer succeeds because they have more to gain from a 
successful takeover. Proposition 1 suggests that the pooling equilib¬ 
rium of Shleifer and Vishny may be sensitive to the assumptions of 
their model that permit the probability schedule to make a discon¬ 
tinuous jump from zero to one. For example, if we perturb the model 
by making y, the information about z possessed by the target share¬ 
holders, imperfectly known to the bidder, then the shareholders’ ten¬ 
dering decisions cannot be foreseen with certainty by the bidder. The 
Shleifer-Vishny equilibrium is based on shareholders’ inference about
$z$, $E(z \mid y, x)$, not rising too rapidly with $x$, so that if $y$ is a known constant,
any bid greater than or equal to $E(z \mid y, x)$ is accepted with certainty.
But with $y$ stochastic, there will be a probability that $x$ exceeds
or is smaller than $E(z \mid y, x)$, so that the offer may succeed or fail.
Hence, the probability that the offer succeeds, instead of a step func¬ 
tion, will be smoothly increasing in the bid. As proposition 1 demon¬ 
strates, a smoothly increasing probability schedule will induce bidders 
to reveal their levels of improvement through their bids, that is, a 
separating rather than a pooling outcome. 7 

If bids are to be accepted probabilistically, there must be uncer¬ 
tainty about the prices at which shareholders will tender. Such uncer¬ 
tainty arises in our model as a result of target shareholders’ random 
choice of whether or not to tender their shares when they are indifferent,


0 and that the percentage rate of increase in probability with the bid is decreasing in the
bid:

$$
\frac{\partial}{\partial x}\left[\frac{P'(x; \alpha, \omega)}{P(x; \alpha, \omega)}\right] < 0.
$$


7 Even in the original game, the Shleifer and Vishny equilibrium is sensitive to the
specification of beliefs. The belief that supports their equilibrium is that all bidders who
would profit from an accepted low bid are equally likely to make the error of bidding too
low. Under these beliefs, the low bid is below the conditional expected value of the gain
from takeover, so shareholders will always reject. This is in contrast to the Banks and
Sobel (1987) criterion of universal divinity, which requires that the likelihood of an off-equilibrium
low bid be assessed to be lower for types for whom such a move is desirable
under a more restricted set of responses by shareholders. It should be noted that the
pooling equilibrium is not removed by some other well-known refinement concepts,
such as the intuitive criterion of Cho and Kreps (1987) and perfect sequential equilibrium
of Grossman and Perry (1986). However, we believe that the arguments breaking
the equilibrium are intuitively appealing. Since the risk of having a bid rejected is less
costly to a low-z bidder than to a high-z bidder, shareholders should infer that low bids
are more likely to be associated with low z's. This tends to promote the acceptance of
low bids, so that the pool evaporates.

that is, when $x = E(z \mid x)$. If they are indifferent, we follow the
convention that the target shareholders' tendering strategies are cho¬ 
sen so that the probability of the offer’s success at different levels of 
the bid supports the proposed equilibrium behavior of the bidders. 8 
The intuitive justification for a mixed-strategy equilibrium is that if 
the bid makes shareholders very nearly indifferent about whether to 
tender, then from the bidder’s perspective the actions of the share¬ 
holders will seem random (see n. 1). In Appendix B, we model this 
explicitly, assuming that the shareholders’ tendering decisions are 
deterministic functions of characteristics unknown to the bidder. The 
mixed-strategy model we develop in this section may be viewed as a 
metaphor for a situation with unknown characteristics of sharehold¬ 
ers; it has the advantage of being far more tractable than the model of 
Appendix B, while yielding the same basic intuitions. 

In a mixed-strategy separating equilibrium, the bid must make 
target shareholders indifferent between tendering and not tendering. 
This is the case in the proposed equilibrium in which the bid is fully 
revealing with x = z. To demonstrate that such an equilibrium exists 
and to solve for the probability schedule that supports it, we substitute 
the inference schedule $E(z \mid x) = x$ for $z$ into (2) and rewrite the equation
in terms of $x$ as

$$
\frac{P'}{P} = \frac{\omega}{\alpha x}. \tag{4}
$$

Integrating both sides of (4) over $x$ and rearranging terms yields a
schedule that expresses the probability of the tender offer’s success as
an increasing function of the level of the bid, that is, $P(x; \alpha, \omega) = k x^{\omega/\alpha}$,
where $k$ is a constant of integration. The constant $k$ is determined by
noting that shareholders will accept any bid greater than $\bar{z}$ with certainty
since they can do no better by retaining their shares. It follows
that $P(\bar{z})$ must equal one (otherwise, the bidder would raise the bid by
one cent), so the probability schedule is

$$
P(x; \alpha, \omega) = \left(\frac{x}{\bar{z}}\right)^{\omega/\alpha}. \tag{5}
$$

This applies when the expected net profit from bidding,

$$
\alpha x P(x; \alpha, \omega) - C = \alpha\, x^{(\omega + \alpha)/\alpha}\, \bar{z}^{-\omega/\alpha} - C,
$$

is positive. The expected profit is increasing in $x$, so there exists a

8 For a given bid x, shareholders do not need to coordinate their actions to generate a 
probability of P(x) of offer success. With a large finite number of shareholders, any 
arbitrary probability of success may be achieved when shareholders select independent 
tendering probabilities close to 1/2. For an analysis of how stochastic outcomes can result
from a continuum of random variables, see Judd (1985). 
critical value $z_c$ below which no offer is made, determined by equating
the expected profit to zero. There is also a minimum value of $\alpha$
consistent with profitable bidding, $\alpha^* = C/\bar{z}$. 9 The preceding results
can be summarized in the following proposition.

Proposition 2. In the tender offer game described above, a mixed-strategy
Bayesian equilibrium exists with the following properties: (1)
The bid equals $z$ (and hence perfectly reveals the bidder’s private
information) for all $z \ge z_c$, where $z_c = (C/\alpha)^{\alpha/(\omega + \alpha)}\, \bar{z}^{\,\omega/(\omega + \alpha)}$. For $z < z_c$,
no bid is made. (2) The tender offer will be successful with probability
$P(x; \alpha, \omega) = (x/\bar{z})^{\omega/\alpha}$, $x \in [z_c, \bar{z}]$. (3) The bidder will offer to buy $\omega = 0.5 - \alpha$
shares, the minimum needed to obtain control.
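The properties in proposition 2 can be checked numerically. The sketch below is only an illustration under assumed parameter values ($\alpha = 0.1$, $\omega = 0.4$, $\bar{z} = 1$, $C = 0.05$); the function names are not from the paper, and the grid search is a crude stand-in for the first-order condition (2).

```python
def success_probability(x, alpha, omega, z_bar):
    """Equation (5): probability that a bid of x succeeds."""
    return (x / z_bar) ** (omega / alpha)


def expected_gain(x, z, alpha, omega, z_bar, cost):
    """Objective (1) for a bidder with improvement z who bids x."""
    payoff_if_success = alpha * z + (z - x) * omega
    return payoff_if_success * success_probability(x, alpha, omega, z_bar) - cost


def critical_improvement(alpha, omega, z_bar, cost):
    """z_c from proposition 2: the improvement at which bidding breaks even."""
    total = omega + alpha
    return (cost / alpha) ** (alpha / total) * z_bar ** (omega / total)


def best_bid_on_grid(z, alpha, omega, z_bar, cost, steps=1000):
    """Bid on an evenly spaced grid over (0, z_bar] that maximizes the gain."""
    grid = [z_bar * i / steps for i in range(1, steps + 1)]
    return max(grid, key=lambda x: expected_gain(x, z, alpha, omega, z_bar, cost))


# Assumed parameter values, for illustration only.
alpha, omega, z_bar, cost = 0.1, 0.4, 1.0, 0.05
z_c = critical_improvement(alpha, omega, z_bar, cost)
print("critical improvement:", z_c)
print("best bid for z = 0.95:", best_bid_on_grid(0.95, alpha, omega, z_bar, cost))
print("gain at the critical type:", expected_gain(z_c, z_c, alpha, omega, z_bar, cost))
```

Given the schedule in (5), the grid search picks the bid equal to the bidder's own improvement, and the expected gain of the critical type is zero up to rounding, which is what a fully revealing equilibrium with a break-even cutoff requires.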

In this analysis a bidder with a high z is induced to submit a high bid 
by the dependence on the bid of the likelihood that the tender offer 
will succeed. 10 Since a bidder with a low z gains less from a successful 
offer, he is less willing than a high-z bidder to increase his bid. Simi¬ 
larly, because of a higher opportunity cost associated with the failure 
of an offer, a bidder with high z is not motivated to bid low. Finally, 
there is no incentive to bid for more shares than the minimum needed 
to gain control because, by (5), bidding for more shares reduces the 
probability of success. 11 

B. Empirical Implications of the Basic Model 

The following points summarize a number of empirical implications 
of proposition 2. 


9 For example, if the cost of bidding is 5 percent of the maximum possible improve¬ 
ment, the minimum initial shareholding needed to make bidding profitable is 5 percent 
of the target. Poulsen and Jarrell (1986) report a range of initial holdings of bidders 
varying from 0 percent to nearly 50 percent. The minimum $\alpha$ needed will be smaller
and can easily be zero if dilution of target shareholders is possible (as in Sec. IC).

10 If shareholders do not know the size of $C$, every sensible bid (i.e., $x \in (0, \bar{z}]$) is
viewed by shareholders as possible in the equilibrium. Hence, if the number of shares
bid for, $\omega$, is taken as given, the separating equilibrium is robust to all the standard
refinements (e.g., intuitive criterion, divinity, and perfect sequential equilibrium), for
the simple reason that it does not require shareholders to draw inferences from out-of-equilibrium
moves.

11 Although we have confirmed that the mixed tendering strategy supports a Bayesian
equilibrium, one may wonder whether the belief revisions are credible in the face of
deviants who bid for a greater number of shares, $\omega > 0.5 - \alpha$. For example, consider
the alternative belief that those who did so had low values of $z < x$. Then their bids
would always be accepted. This high acceptance rate would encourage bids for more
than $0.5 - \alpha$ shares. However, this deviant belief is not consistent because if a bid of $x < \bar{z}$
were always accepted, then those with $z > x$ would also find it profitable to bid $x$. On
intuitive grounds, it is not plausible that bidding for more shares is a signal of low $z < x$.
It is when the bidder intends to bid below his value, $z > x$, that he profits from his share
purchases and has something to gain by buying a greater number of shares. So the
proposed belief, which rules out using high share purchases as a way of signaling low
size of improvements, seems reasonable.
Implication 1. The probability of an offer’s success increases with 
the bid premium and with the initial holdings of the bidder in the 
target, and it is decreasing in the number of shares required to obtain 
control. 

The intuition is that when, for example, a supermajority provision 
forces the bidder to make an offer for more shares, the marginal 
savings from underbidding are greater, so a steeper slope of the 
probability schedule is required to deter the highest type of bidder 
with $z = \bar{z}$ from underbidding. The effect of $\alpha$ on the probability of
success arises because lower $\alpha$ implies relatively lower potential profits
from originally owned shares compared with purchased shares, which
increases the incentive to underbid. So if the bidder has a smaller
initial holding, it takes a larger drop in probability to deter a high-valuation
type from underbidding. This is shown in figure 1, where
for the highest possible bid $x = \bar{z}$, $P(x; \alpha, 0.5 - \alpha) = 1$ for both high



Fig. 1.—Probability of offer success as a function of the level of the bid (ai > a©) 
and low $\alpha$, so that for lower $\alpha$, the schedule is steeper at the right
endpoint. The result that the probability of success rises with the bid 
premium and with the bidder's initial holdings is consistent with the 
evidence of Walkling (1985). 
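
As a rough numerical illustration of Implication 1, the sketch below evaluates a basic-model schedule of the form $P(x) = (x/\bar{z})^{\omega/\alpha}$ for two initial holdings. The functional form is an assumption inferred from the way later sections refer back to the basic model; the function name, the normalization $\bar{z} = 1$, and the parameter values are hypothetical.

```python
def basic_schedule(x, alpha, omega, z_bar=1.0):
    """Assumed basic-model probability that a bid x succeeds.

    alpha : bidder's initial holding (fraction of shares)
    omega : fraction of shares the bidder must purchase
    z_bar : largest possible improvement (hypothetical scale of 1.0)
    """
    if not 0.0 < x <= z_bar:
        raise ValueError("bid must lie in (0, z_bar]")
    return (x / z_bar) ** (omega / alpha)


# Compare a small and a large initial holding when 0.5 - alpha shares are sought.
for alpha in (0.10, 0.25):
    omega = 0.5 - alpha
    schedule = [round(basic_schedule(x, alpha, omega), 4) for x in (0.6, 0.8, 1.0)]
    print(f"alpha={alpha:.2f}: {schedule}")
```

Under these assumptions the schedule with the smaller holding lies below the other at every bid short of $\bar{z}$ and is steeper at the right endpoint, and raising `omega` for a fixed `alpha` (a supermajority provision) lowers the schedule in the same way.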

Implication 2. The ratio of the stock price reaction at the announcement
of the bid to the bid premium is increasing in both the
level of the bid and the initial shareholdings of the bidder, and it is 
decreasing in the number of shares required to obtain control. 

This implication is due to the impact of these parameters on the 
probability of an offer's success. 

It is also of interest to examine the effect of varying parameters on
$z^c$. This leads to the following prediction.

Implication 3. The average bid premium declines with the size of
the bidder’s initial holdings in the target and increases with the number
of shares needed to obtain control.

An increase in $\alpha$ raises the expected profit from making an offer by
increasing the probability of success and increasing profits in the
event the offer succeeds. As a result, increasing $\alpha$ makes it profitable
for lower-type bidders to make an offer, so $z^c$ and, hence, the average
bid decline. 12 Moreover, with bids more likely to be made and more
likely to be successful, the preoffer market price is higher. This is
consistent with Walkling and Edmister (1985), who document that the
average bid premium over the market price is decreasing in the initial
shareholding of the bidder. 13 Similarly, an increase in the number of
shares needed to win control, by reducing the probability of success,
causes $z^c$ to increase, raising the expected premium. If the number of
shares needed for control varies across firms, this is also consistent
with the evidence of Walkling and Edmister, who found that a 0-1
variable indicating whether more than 0.5 - $\alpha$ shares were sought in
the bid had a positive impact on average premia.

Implication 4. Activities that reduce the degree of asymmetry of 
information between bidder and shareholders, such as the payment 
of solicitation fees to persuade shareholders to tender, are predicted 
to increase the probability of offer success. 


12 Algebraically, the formula for $z^c$ in proposition 2 declines as long as $\alpha > C/e\bar{z}$,
where $e$ is Euler’s constant. This must hold in the relevant range of $z^c \le \bar{z}$, which implies
$\alpha \ge C/\bar{z}$.

13 However, the evidence of Franks and Harris (1988) is only partially supportive.
Consistent with the prediction, they find that the premium is lower if the initial
shareholding exceeds 30 percent than if it is positive but less than 30 percent. However, the
premium is also lower in the third category of zero shareholdings. This is consistent
with the hypothesis that offers made without an initial shareholding are made by
bidders with a credible threat to dilute target shares, leading to lower bid premia (see
Sec. IC).

A reduction in the degree of asymmetry of information, that is, a
mean-preserving inward shift in the distribution of $z$ that lowers $\bar{z}$, the
maximum possible value of the improvement, will, by proposition 2,
raise the probability of offer success at any given level of the bid. In
consequence, the bidder will be motivated to take actions to communicate
information about the source of and likely magnitude of the
improvement to target shareholders. This could be one function of
hiring brokers to solicit shares. Consistent with our analysis, Walkling
finds that the payment of solicitation fees does increase the probability
of takeover success.

C. The Solution with Dilution 

Grossman and Hart (1980), in a model with only small shareholders, 
stress that for tender offers to be profitable some means by which a 
successful bidder can dilute the value of minority holdings is 
needed. 14 Shleifer and Vishny pointed out that if there is a large 
shareholder, he can profitably initiate a bid without dilution. In this 
subsection we examine a solution in which the potential bidder has 
large shareholdings and also has the power to engage in some dilution 
of minority shareholders. 

Suppose that the amount by which minority shares may be diluted
contains a fixed component $\delta_0$ and a proportional component $\delta_1$,
where $\delta_0$ and $\delta_1 < 1$ are known constants. A dilution by $\delta_0 + z\delta_1$
means that after obtaining control, the bidder can reduce the value
per share, so that the posttakeover value of minority shares is
$z(1 - \delta_1) - \delta_0$. 15 Since target shareholders do not know $z$, this assumption
implies that they also do not know the posttakeover value of their
shares.

In a perfectly revealing equilibrium, shareholders will be just indifferent
about whether to tender if they receive a bid they believe to be
equal to the posttakeover value,

$$x = z(1 - \delta_1) - \delta_0. \tag{6}$$


14 Dilution refers not just to expropriation of assets of the target, but to sharing some
of the gain from the improvement. We therefore view dilution as widely prevalent, and 
despite the connotation of the word, it need not indicate any malfeasance or predatory 
behavior on the part of the bidder. 

15 More generally, one might allow $\delta_0$ and $\delta_1$ to be random. However, this would
change the solution radically because the probability needed to persuade a bidder to
bid “truthfully,” that is, $x = z(1 - \delta_1) - \delta_0$, will in general depend separately on the bidder’s
$z$ and on his $\delta$'s: for a given $z(1 - \delta_1) - \delta_0$, the gains to success are greater for a bidder
with higher $z$ since his profit on his own shares is larger. We conjecture that this should
lead to a solution in which bidders sometimes overbid and sometimes underbid but in
which the bid is equal to $E[z(1 - \delta_1) - \delta_0 \mid x]$ (to keep the shareholders indifferent).

The probability schedule will depend on $\delta_0$ and $\delta_1$, but we suppress
all arguments except for $x$ and write the probability of success as $P(x)$.
The bidder’s objective function is then

$$\max_x\,[\alpha z + \omega(z - x) + 0.5(\delta_0 + \delta_1 z)]P(x).^{16} \tag{7}$$


Following steps that parallel the previous section, we may take the
first-order condition of the bidder’s problem, invert the bidding function
(6) to obtain the inference schedule $\hat{z}(x) = (x + \delta_0)/(1 - \delta_1)$, and
substitute $\hat{z}(x)$ for $z$ in the first-order condition to obtain a differential
equation for $P(x)$. Shareholders will always accept the highest possible
bid, $\bar{x} = (1 - \delta_1)\bar{z} - \delta_0$, so $P(\bar{x}) = 1$. This boundary condition implies
the probability schedule



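$$P(x) = \left[\frac{x + \beta_0\beta_1}{\bar{x} + \beta_0\beta_1}\right]^{\beta_1}, \tag{8}$$

where $\beta_0 = \delta_0/[\omega(1 - \delta_1)] > 0$, and $\beta_1 = \omega(1 - \delta_1)/[\alpha + \delta_1(1 - \alpha)] > 0$.
As $x$ is linearly related to $z$, the probability of success is indirectly a
function of the improvement $z$,

$$P^*(z) = \left[\frac{z + d}{\bar{z} + d}\right]^{\beta_1},$$

where, by its definition, $\beta_1 < \omega/\alpha$, and $d = (\beta_0\beta_1 - \delta_0)/(1 - \delta_1) > 0$.
Since the displacements $\beta_0\beta_1, d > 0$ are increasing in $\delta_0$ and $\delta_1$ while
$\beta_1$ is decreasing, the probability of success is uniformly higher than it
is in the basic model and increases with the amount of dilution.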
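
The comparison with the basic model can be checked numerically. The following is a minimal sketch, assuming that the bidder seeks $\omega = 0.5 - \alpha$ shares, that $\bar{z}$ is normalized to 1, and that the basic-model schedule is $(z/\bar{z})^{\omega/\alpha}$; the function names and parameter values are hypothetical.

```python
def dilution_params(alpha, delta0, delta1):
    """Return (beta0, beta1, d) as defined below equation (8)."""
    omega = 0.5 - alpha  # assumption: bidder buys just enough for control
    beta0 = delta0 / (omega * (1.0 - delta1))
    beta1 = omega * (1.0 - delta1) / (alpha + delta1 * (1.0 - alpha))
    d = (beta0 * beta1 - delta0) / (1.0 - delta1)
    return beta0, beta1, d


def dilution_prob(z, alpha, delta0, delta1, z_bar=1.0):
    """Success probability as a function of the improvement z, with dilution."""
    _, beta1, d = dilution_params(alpha, delta0, delta1)
    return ((z + d) / (z_bar + d)) ** beta1


def basic_prob(z, alpha, z_bar=1.0):
    """Assumed basic-model schedule (no dilution)."""
    omega = 0.5 - alpha
    return (z / z_bar) ** (omega / alpha)


# Illustrative (hypothetical) parameter values.
for z in (0.3, 0.5, 0.8):
    print(z, round(basic_prob(z, 0.2), 4), round(dilution_prob(z, 0.2, 0.05, 0.1), 4))
```

With $\delta_0 = \delta_1 = 0$ the function collapses to the basic schedule, and for the illustrative values above the diluted schedule lies above it at each $z < \bar{z}$.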

Profits increase with dilution both directly in (7) and as an indirect
result of the increased probability of success. As a result, $z^c$ decreases.
Dilution therefore raises the probability that a bid will be made, as
well as raises the probability of success of an outstanding offer. It
remains the case with dilution that higher initial shareholdings $\alpha$ increase
the probability of offer success and reduce the critical value $z^c$.

The intuition for why the probability of success increases with the
amount of dilution ($\delta_0$ and $\delta_1$) is roughly that the effect of being able
to dilute is similar to the effect of raising $\alpha$ described in Section IA.
Higher dilution increases the cost to underbidding because a given
drop in probability reduces the expected gain arising from the term
$0.5\delta_1 z$ by a greater amount. When $z$ is high, this implies a shallower
slope of the probability schedule when $\delta_1$ is high. But when $z$ is low,
the differential in probabilities between a low- and high-$\delta_1$ bidder
becomes larger, increasing the benefit for a bidder to underbid if $\delta_1$ is
high.

16 The term $0.5(\delta_0 + \delta_1 z)$ may be interpreted in two ways. First, if dilution has no
deadweight costs, it reflects the profit to the bidder when he can successfully appropriate
resources from the minority shareholders. Alternatively, if dilution is costly, after
obtaining a majority of shares from a tender offer, with the value of the improvement
revealed, the bidder can make a cleanup offer for the remaining shares, setting the bid
in the cleanup offer equal to $z(1 - \delta_1) - \delta_0$. Since the bidder would profitably dilute the
minority were this offer to fail, the remaining shareholders are all willing to tender.

The model predicts that dilution threats will be reflected in the 
“first tier” of the offer, so that minority shareholders do no worse 
than those who tender. However, in Section IIB2 we shall see that
when managers take defensive measures, “overbidding” is possible, so 
that the bid can potentially exceed the postoffer share price. 

D. Costs of Bid Failure 

The preceding model assumes that if target shareholders reject the 
bid, the bidder will lose the target with certainty and hence will suffer 
costs that are increasing in the size of his improvement. In reality a 
bidder who is rejected by shareholders may be able to revise his bid 
and ultimately succeed. If a first failure were entirely costless, then a 
low-type bidder would be unable to separate from a high-type bidder 
on the first bid. 

There are a number of reasons why rejection of an initial bid may
ultimately result in a failed or at least a less profitable acquisition. For
example, the rejection may result in the loss of a window of opportunity,
such as a reduction of synergies. Alternatively, failure of the
initial bid may give management or labor unions more time to
mobilize legal or asset structure defensive activities. If the management
response blocks the takeover, it leads to a loss to the bidder that
is consistent with the assumptions of the basic model. A failed bid may
also give management the opportunity to learn about and take steps
to preempt the policies planned by the bidder, increasing the firm’s
pretakeover stock price and making the takeover unprofitable. If the
bidder profits only on his initial stake ($\alpha$), this could in principle help
him as much as an actual acquisition. However, if incumbent management
cannot implement the improvements or synergies efficiently, or
if (as will normally be the case) the bidder can appropriate some
positive fraction of a takeover improvement through dilution, then
the attempt to preempt the improvement imposes an opportunity cost
on the bidder that increases with the size of the improvement.

Another important consideration is that rejection may give competing
bidders time to enter. Again, this would involve a loss of the gains
associated with dilution. If we set $\alpha = 0$ in the model with dilution,
the solution is identical to that of an alternative model in which a
failed offer always results in the appearance of a competing bidder
who can successfully purchase the firm by matching the initial bid.
The objective function of the first bidder is then just (7) with $\alpha = 0$,
and the probability schedule is precisely (8) with $\beta_1 = \omega(1 - \delta_1)/\delta_1$.
This illustrates simply that a separating equilibrium can be enforced
by a cost of failure arising from a competing bidder who appropriates
the potential dilution. 17


II. Management Defensive Actions 

There has been a great deal of debate about whether managerial 
defensive measures are in shareholders’ interests or whether they are 
a means of entrenching managers pursuing their own objectives (see, 
e.g., Easterbrook and Fischel 1981; Gilson 1981; Bebchuk 1982). In 
the next two subsections we examine three categories of defensive 
measures: contingent cost defensive strategies, which impose costs on 
the bidder only in the event that he is successful; pretakeover costs, 
which are imposed on the bidder prior to the outcome of the offer; 
and blocking defensive strategies, which increase the likelihood that 
the bid will be disallowed for legal reasons. Our analysis takes these 
defensive measures as exogenous and examines their effects on the 
amount that the bidder offers, the tendering strategies of target 
shareholders, and the probability of offer success. We show that while 
some defensive measures reduce shareholder value, others can increase
both the amount bid and the probability of the offer’s success.

A. Contingent Defensive Costs 

A number of defensive strategies impose costs on the bidder only in
the event that a tender offer succeeds. These strategies may redistribute
wealth from the successful bidder to the nontendering shareholders;
they may impose deadweight costs on the bidder without
affecting the value of untendered shares; or they may reduce the
posttakeover value of the remaining shares as well. These distinctions
are important since the type of cost imposed affects both the
shareholders’ tendering decisions and the bidder’s strategy.

The following is a general model that incorporates all three possibilities.
As a special case, in Section IIA1, we examine “poison pills,”
in which the loss to the bidder, should the pill be triggered, is fully
redistributed to the remaining target shareholders; in Section IIA2,
we examine value-reducing measures (“sale of the crown jewels”), in
which the shareholders who do not tender are also hurt by the defensive
measure. 18

17 A competing bidder model in which the initial stake of the first bidder is positive is
available from the authors on request.

Let $h(z)$ be the cost per share imposed on the shares either originally
held or purchased by the bidder should the bid succeed. If the
takeover succeeds, the value per minority share is increased by an
amount $z + \epsilon h(z)$, where $-1 \le \epsilon \le 1$, $h(z) > 0$, and $h'(z) \ge 0$. Here $h(z)$
will generally be increasing in the improvement. For example, in a
discriminatory rights offering to all shareholders but the bidder (a
poison pill), the rights will be worth more when the firm is worth
more. The term $\epsilon$ is a redistributive parameter that reflects the fact
that the target shareholders may not fully appropriate the costs imposed
on bidders.

In a mixed-strategy separating equilibrium, the bidder must bid the
value of the target shares should the takeover succeed, inclusive of the
redistribution, so that

$$x = z + \epsilon h(z). \tag{9}$$

Given the costs imposed on the bidder, his problem is

$$\max_x\,[\alpha z + (z - x)\omega - 0.5h(z)]P(x). \tag{10}$$

If $h$ is linear, that is, $h(z) = a + bz$, where $b < 1$, the equilibrium is
derived from the first-order condition of (10) by substituting for $z$
using (9) and solving the differential equation for $P(x)$. Imposing the
boundary condition of certain success at the highest bid, $P(\bar{x}) = P[\bar{z} +
\epsilon h(\bar{z})] = 1$, we obtain

$$P(x) = \left[\frac{x - (B/A)}{\bar{x} - (B/A)}\right]^{\omega(1 + \epsilon b)/[\alpha - b(\omega\epsilon + 0.5)]}, \tag{11}$$

where $A = [\alpha - b(\omega\epsilon + 0.5)]/[\omega(1 + \epsilon b)]$, and $B = [0.5a(1 + \epsilon)]/[\omega(1 + \epsilon b)]$.
Insight into particular forms of managerial defensive measures
may be derived by examining special cases of this model. The following
subsections examine cases in which $\epsilon = -1$ and 1. 19
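
The two special cases can be checked against the general schedule (11) numerically. The sketch below is illustrative only: it assumes $\omega = 0.5 - \alpha$ and $\bar{z} = 1$, and the function names and parameter values are hypothetical.

```python
def max_bid(a, b, eps, z_bar=1.0):
    """Highest bid x_bar = z_bar + eps * h(z_bar) with linear h(z) = a + b*z."""
    return z_bar + eps * (a + b * z_bar)


def contingent_cost_schedule(x, alpha, a, b, eps, z_bar=1.0):
    """Probability schedule (11) for a linear contingent cost h(z) = a + b*z."""
    omega = 0.5 - alpha  # assumption: bidder buys just enough for control
    denom = alpha - b * (omega * eps + 0.5)
    scale = omega * (1.0 + eps * b)
    if denom <= 0.0 or scale <= 0.0:
        raise ValueError("parameters outside the range covered by this sketch")
    A = denom / scale
    B = 0.5 * a * (1.0 + eps) / scale
    shift = B / A
    x_bar = max_bid(a, b, eps, z_bar)
    if not shift < x <= x_bar:
        raise ValueError("bid must lie in (B/A, x_bar]")
    # The exponent omega(1 + eps*b) / [alpha - b(omega*eps + 0.5)] equals 1/A.
    return ((x - shift) / (x_bar - shift)) ** (1.0 / A)


# eps = -1 (value reduction) versus eps = +1 (poison pill), hypothetical values.
print(contingent_cost_schedule(0.5, alpha=0.2, a=0.05, b=0.1, eps=-1.0))
print(contingent_cost_schedule(0.8, alpha=0.2, a=0.05, b=0.1, eps=1.0))
```

With $\epsilon = -1$ the term $B$ vanishes and $A = \alpha/\omega$, so the schedule reduces to $(x/\bar{x})^{\omega/\alpha}$; with $\epsilon = 1$ it matches the dilution schedule (8) evaluated at $\delta_0 = -a$ and $\delta_1 = -b$.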


18 A sale of valuable assets prior to a bid can still be considered a “contingent” cost, in 
the sense that the bidder's wealth is reduced by this action more if his offer succeeds 
than if it fails. Of course, from the target shareholders' point of view, a measure that 
becomes operative only if the takeover succeeds (such as legislation limiting investment 
changes by new management) may be very different from a sale of assets that becomes 
operative regardless of whether a takeover occurs. 

19 The intermediate case of $\epsilon = 0$ corresponds to defensive measures that, while
imposing costs on the bidder in the event of success, do not affect the wealth of the
minority target shareholders. This may approximate the effects of charter amendments
providing for staggered terms of directors. When a bid succeeds, these amendments
may force the bidder to suffer further litigation costs, or the costs of a proxy
fight, before implementing his program. In our model, such measures imply a uniform
reduction in the probability of offer success.

1. Poison Pills: Redistributive Defensive Measures

A poison pill is a defensive measure that redistributes wealth from the
bidder to the shareholders who do not participate in the tender offer
if the bidder accumulates a sufficient number of shares. Such a redistribution
corresponds to $\epsilon = 1$ in (11), implying that the probability
schedule is identical in form to the schedule in the model with dilution,
with $\delta_0 = -a$ and $\delta_1 = -b$. In other words, poison pills can be
viewed as mechanisms that generate negative dilution. In consequence,
the poison pill unambiguously reduces the probability of success
and increases the bid for a given level of the improvement. Ryngaert
(1988) and Malatesta and Walkling (1988) found a negative
average stock price reaction to the announcement of poison pills. This
is consistent with our model to the extent that the reduction in the
probability of takeover outweighs the benefit to shareholders of being
able to extract a higher premium.

2. Value-Reduction Strategies 

We now examine defensive measures that impose costs on the target 
shareholders as well as the bidder, should the takeover succeed. One 
such measure is to lobby for legislation that outlaws the investment 
changes the bidder wishes to make. Other examples are the “scorched
earth” or “sale of crown jewels” defensive measures, in which the firm
sells off those divisions or assets whose values can be improved. 

A value-reducing strategy is reflected in the current model by a
negative $\epsilon$, so that reductions in bidder wealth are associated with
reductions in the improvement in target shareholder wealth. We examine
the pure case in which $\epsilon = -1$. It should be stressed that we
are considering a defensive measure that reduces the size of the improvement
from a takeover. The sale, at below the market price, of an
asset that cannot be improved by the acquirer or any other measure
that reduced firm value without altering the amount by which it could
be improved would have no effect on the probability schedule or on
the premium ($x$) offered above the firm’s value under current management.
When $\epsilon = -1$, the reduction in value per share is the same
for the bidder and for the nontendering shareholders. This case is
equivalent to a reduction in $z$ at all its values. Hence, by (11),

$$P(x) = \left(\frac{x}{\bar{x}}\right)^{\omega/\alpha}, \tag{12}$$

as in the basic model. Note that since $\bar{x} = \bar{z} - h(\bar{z}) < \bar{z}$, the probability
of acceptance as a function of the amount bid rises. This is not surprising
since a given bid becomes more attractive when compared with a
reduced value of the improvement. However, the amount bid will also
be reduced, so the probability of success of a bidder with a given
improvement of $z$ may or may not be reduced. The level of the bid is

$$x = z - h(z) = -a + (1 - b)z, \tag{13}$$

which when substituted into (12) gives the probability schedule in
terms of $z$ of

$$P^*(z) = \left[\frac{z - [a/(1 - b)]}{\bar{z} - [a/(1 - b)]}\right]^{\omega/\alpha}. \tag{14}$$


A value-reducing defensive strategy can either increase or decrease
the severity of the information asymmetry between the bidder and
the other target shareholders, depending on the values of $a$ and $b$. A
fixed reduction in the size of the improvement, that is, $a > 0$ and $b = 0$,
makes the improvement more uncertain relative to its mean and
hence reduces the probability of success as well as the profits in the
event of success. Alternatively, with $a = 0$ the probability of offer
success is unchanged. The prior uncertainty about the improvement
relative to its mean is unaffected by such a measure; however, both
the level of the bid and the profits of the bidder are reduced. Therefore,
such a strategy can deter a potential bidder from attempting the
takeover. 20

Perhaps most interesting, if $a < 0$ and $b > 0$, the asymmetry of
information about the increase in the value of target shares (net of
defensive costs) that the takeover will bring about is diminished. 21 By
(14), it follows that this value-reduction measure increases the probability
of offer success. Intuitively, it is asymmetry of information that
leads to bid failure, and to the extent that this can be reduced, the
frequency of acceptance is raised. 22
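
The three cases just described can be illustrated by evaluating (14) directly. This is a minimal sketch, assuming $\omega = 0.5 - \alpha$ and $\bar{z} = 1$; the function name and the particular values of $a$ and $b$ are hypothetical.

```python
def value_reduction_prob(z, alpha, a, b, z_bar=1.0):
    """Equation (14): success probability in terms of z when eps = -1 and h(z) = a + b*z."""
    omega = 0.5 - alpha  # assumption: bidder buys just enough for control
    shift = a / (1.0 - b)
    if not shift < z <= z_bar:
        raise ValueError("improvement must lie in (a/(1-b), z_bar]")
    return ((z - shift) / (z_bar - shift)) ** (omega / alpha)


z, alpha = 0.6, 0.2
print("no defense          :", value_reduction_prob(z, alpha, a=0.0, b=0.0))
print("fixed cost (a > 0)  :", value_reduction_prob(z, alpha, a=0.1, b=0.0))
print("proportional (a = 0):", value_reduction_prob(z, alpha, a=0.0, b=0.3))
print("a < 0 and b > 0     :", value_reduction_prob(z, alpha, a=-0.1, b=0.2))
```

In this sketch only the ratio $a/(1 - b)$ shifts the schedule, which is why a purely proportional cost leaves the probability unchanged while the sign of $a$ determines whether it falls or rises.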

The results of Section IIA can be summarized by the following 
proposition. 

Proposition 3. For poison pills ($\epsilon = 1$), the probability of success as
a function of either the level of the bid or the size of the improvement
($z$) is lowered. For value-reducing defensive measures ($\epsilon = -1$), the
probability of success as a function of the bid is higher than in the
basic model without defensive measures; however, the probability of
success as a function of the bidder’s improvement can be greater,
lower, or the same as when there are no defensive measures. A
value-reducing defense can either raise, lower, or leave unchanged the
probability that an offer will be made.

20 Dann and DeAngelo (1988) examine a number of cases in which targets sell assets
for defensive purposes, and they provide evidence that stock prices decline on average.

21 To illustrate, let $z = z_1 + z_2$, where $z_1$ is an improvement of known value and $z_2$ is
an improvement whose value is unknown to target shareholders. These may be viewed
as two projects that the bidder could undertake. Consider a measure that imposes costs
$b'z_2$ that are proportional to $z_2$, so that $h(z) = b'z_2 = b'(z - z_1) = b'z - b'z_1$. Hence, this
measure is subsumed by our general framework with $b = b'$ and $a = -b'z_1 < 0$. It is
worth noting that in this example, $z \ge z_1 > 0$; i.e., the improvement is bounded away from
zero. Otherwise, the specification would imply that the defensive measure could be
value increasing for small $z$, which does not seem plausible.

22 For similar reasons, a value-increasing strategy such as preemption by the target
management of the planned improvement could decrease the probability of success.
This corresponds precisely to the fixed reduction in the improvement, $a > 0$ and $b = 0$,
discussed in the paragraph above.

B. Litigation 

Litigation by incumbent management imposes costs on the bidder
and can in some cases directly block a takeover. Any legal costs undergone
by the bidder prior to the outcome of the offer are uncontingent
in that they are expended even if the bid should later fail. We examine
separately the effect on the bidder’s strategy of the uncontingent
legal costs that are imposed (Sec. IIB1) and of the possibility of blocking
the bid (Sec. IIB2).

1. Defensive Costs Imposed prior to 
the Offer Outcome 

The analysis that follows assumes that the magnitude of defensively
imposed costs depends on the amount bid. 23 In this case, the bidder’s
objective is

$$\max_x\,[\alpha z + (z - x)\omega]P(x) - h(x), \tag{15}$$

where $h(x)$ is the cost imposed on the bidder by the managerial defensive
action prior to the offer outcome, $h'(x) \le 0$. Taking the first-order
condition of (15) and substituting the condition for a fully revealing
equilibrium that $z = x$ gives a linear first-order differential equation
for the probability schedule as a function of $x$. The following proposition,
which assumes that the model parameters are such that a mixed-strategy
equilibrium exists (i.e., $x = z$), can be proved by solving this
equation subject to the initial condition $P(\bar{z}) = 1$.

Proposition 4. In the tender offer game with costs imposed by
management defensive litigation, when a separating equilibrium exists,
(1) the probability of an offer’s success in the mixed-strategy
signaling equilibrium is

$$P(x) = P_0(x) + \frac{x^{\omega/\alpha}}{\alpha}\int_{\bar{z}}^{x} h'(y)\,y^{-(1 + \omega/\alpha)}\,dy$$

if this quantity is uniformly less than or equal to one, where $P_0(x)$ is
the probability schedule that applies when $h' = 0$, as in the model
without target-imposed costs; (2) if $h' < 0$, defensive measures increase
the probability of an offer’s success.

23 The assumption that the legal cost of the offer is decreasing in the size of the bid
arises from the possibility that courts may be more sympathetic to defensive suits if a
low price has been offered to shareholders; e.g., the statutes of a number of states give
target shareholders appraisal rights. Alternatively, management may not fight high
bids as hard as low bids.

To see that part 2 is true, note that the integrand is a negative
quantity ($h' < 0$) and the limits indicate a backward interval ($\bar{z} \ge x$), so
$P(x)$ is higher under the new solution than under the old one. This is
intuitive: if we start from the endpoint $x = \bar{z}$ and reduce $x$, the probability
may fall less and still deter a lower bid since a lower bid would
lead to higher litigation costs. So when a litigation strategy is pursued,
the probability of takeover success, given that a bid is made, can
actually increase!
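
A numerical sketch may help make part 2 concrete. Setting $z = x$ in the first-order condition of (15) gives $\alpha x P'(x) - \omega P(x) = h'(x)$, and integrating back from $P(\bar{z}) = 1$ yields $P(x) = P_0(x) + (x^{\omega/\alpha}/\alpha)\int_{\bar{z}}^{x} h'(y)\,y^{-(1+\omega/\alpha)}\,dy$. The code below evaluates this expression by a midpoint rule. It assumes $\omega = 0.5 - \alpha$, $\bar{z} = 1$, and a hypothetical linear cost $h(x) = c(\bar{z} - x)$, so that $h'(x) = -c$; all names and values are illustrative.

```python
def litigation_schedule(x, alpha, h_prime, z_bar=1.0, steps=2000):
    """Success probability with bid-dependent litigation costs (midpoint-rule integration)."""
    omega = 0.5 - alpha  # assumption: bidder buys just enough for control
    gamma = omega / alpha
    p0 = (x / z_bar) ** gamma           # assumed basic-model schedule
    width = (x - z_bar) / steps         # negative: the interval runs backward from z_bar to x
    integral = 0.0
    for i in range(steps):
        y = z_bar + (i + 0.5) * width   # midpoint of the i-th subinterval
        integral += h_prime(y) * y ** (-(1.0 + gamma)) * width
    return p0 + (x ** gamma / alpha) * integral


alpha, c = 0.2, 0.1                     # hypothetical values; c is the marginal litigation cost
for x in (0.4, 0.6, 0.8, 1.0):
    p0 = x ** ((0.5 - alpha) / alpha)
    print(x, round(p0, 4), round(litigation_schedule(x, alpha, lambda y: -c), 4))
```

For this constant marginal cost the integral has the closed form $P(x) = P_0(x) + (c/\omega)[1 - P_0(x)]$, which stays at or below one whenever $c \le \omega$ and exceeds $P_0(x)$ at every bid below $\bar{z}$.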

Litigation by the target can also lead to offers that exceed the magnitude
of the improvement, $x > z$. For example, if the manager is able
to impose high costs on all bids below $\bar{z}$, some bidders with improvements
below $\bar{z}$ increase their bids to $x = \bar{z}$. In consequence, it is
possible that imposing costs on bidders may be in the target shareholders’
interest for two reasons: first, because this can raise the probability
of an offer’s success and, second, because it can increase the
pressure to make a higher bid. Jarrell (1985) provides evidence that
litigation increases the takeover price and, hence, can sometimes be in
the interest of target shareholders.

2. Blocking Defensive Measures 

In addition to imposing a cost on bidders, management opposition
may be able to force the bidder to withdraw his offer for legal reasons.
Let $T(x)$ equal the probability that 0.5 - $\alpha$ shares are tendered, and
let $U(x)$ equal the probability that the offer is not blocked by the
courts. It is assumed that the two sources of failure occur independently.
Let $P(x)$ be defined as the overall probability that a bid succeeds,

$$P(x) = T(x)U(x). \tag{16}$$

The objective of the potential acquirer, if he chooses to bid, is still (1),
leading as before to the first-order condition (2). Any solution to this
differential equation for which $x = z$, consistent with the randomization
of shareholders, satisfies

$$T(x)U(x) = kx^{\omega/\alpha}. \tag{17}$$

When we solve for $P(x)$, the difference between this problem and that
of Section IA is that, with $T(x) \le 1$, the additional constraint $P(x) \le
U(x)$ is imposed.

Sometimes the ability to block the bid is not based on the level of the
bid (e.g., “antitrust” defensive lawsuits), in which case the blocking
probability $U(x)$ is a constant, $U$. Then since $P_0(\bar{z}) = 1$, the schedule $P(x)$
is necessarily lower at the right endpoint, as illustrated in figure 2.
The resulting schedule may be determined by applying the initial
condition $P(\bar{z}) = U$ to (17). This yields the solution

$$P(x) = U\left(\frac{x}{\bar{z}}\right)^{\omega/\alpha}. \tag{18}$$

A comparison of (18) with (5) illustrates that in this case the blocking
defensive measure uniformly reduces the probability of the offer’s
success. 24

In other cases, the probability of the success of a legal action is
decreasing in the amount of the offer, that is, $U'(x) \ge 0$. 25 If in addition
$U(x)$ is never below $P_0(x)$, then the boundary condition for $P(x)$ in
the basic model may be applied to (17) without modification. Hence,
the resulting probability schedule is the same as that in the basic
model, implying that the direct loss in probability due to defensive
action is precisely offset by an increased willingness to tender! Moreover,
as the payoffs to the bidder are exactly the same as before, $z^c$ also
does not change. So the defensive measure will be entirely ineffective
in promoting either shareholders’ goals or those of an entrenched
management. This extreme case illustrates the more general point
that shareholders may compensate for blocking measures by increasing
their willingness to tender their shares.
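
The offsetting response of shareholders can be seen in a small sketch that backs out the tendering probability $T(x) = P(x)/U(x)$ from (16). It assumes $\omega = 0.5 - \alpha$, $\bar{z} = 1$, and a basic schedule $P_0(x) = (x/\bar{z})^{\omega/\alpha}$; the function names and the court schedule $U(x) = \sqrt{x}$ are hypothetical.

```python
def basic_schedule(x, alpha, z_bar=1.0):
    """Assumed basic-model success probability P0(x)."""
    omega = 0.5 - alpha  # assumption: bidder buys just enough for control
    return (x / z_bar) ** (omega / alpha)


def constant_blocking(x, alpha, u, z_bar=1.0):
    """Equation (18): overall success probability when courts pass any bid with probability u."""
    return u * basic_schedule(x, alpha, z_bar)


def tendering_probability(x, alpha, court_pass, z_bar=1.0):
    """T(x) = P0(x) / U(x) in the non-binding case where U(x) >= P0(x)."""
    p0 = basic_schedule(x, alpha, z_bar)
    u = court_pass(x)
    if u < p0:
        raise ValueError("U(x) < P0(x): the basic schedule is infeasible at this bid")
    return p0 / u


alpha = 0.2
court_pass = lambda x: x ** 0.5   # hypothetical schedule, increasing in the bid
for x in (0.4, 0.6, 0.8, 1.0):
    print(x,
          round(basic_schedule(x, alpha), 4),
          round(constant_blocking(x, alpha, 0.8), 4),
          round(tendering_probability(x, alpha, court_pass), 4))
```

In the constant case the overall probability is scaled down by `u` at every bid, whereas in the non-binding case shareholders tender with a probability above $P_0(x)$, so that $T(x)U(x)$ reproduces the basic schedule.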

The intuition can be seen by imagining that the tendering probabilities
of shareholders were unchanged. Then opposition would increase
the incentive of a bidder with a given $z$ to bid high to avoid
being blocked. A given bid would thus become more attractive to
shareholders. The increased willingness of shareholders to tender at a
given bid can offset the direct probability-reducing effect of management
defensive measures. 26

24 More generally, if $U(x) < (x/\bar{z})^{\omega/\alpha}$ for any $x \in (0, \bar{z}]$, then the original schedule $P_0(x)$
becomes infeasible. The reduction in probability of success imposed by the defensive
action is binding because for some $x$, even were shareholders to tender with certainty,
the probability of legal success would be less than $P_0(x)$. However, the reduction in
probability that results could be slight.

25 This assumption may be justified by arguments similar to those in n. 23.

26 Further insight into the source of the greater willingness to tender arises from the
model of App. B. There, defensive measures can give the bidder an incentive to bid
higher, and the higher bidding not only makes shareholders less skeptical in their
assessments of $z$ but also raises the probability that the excess of $x$ over the assessed $\hat{z}$ is
large enough to outweigh the costs of tendering. Here, with shareholders just indifferent,
the willingness to tender is infinitely elastic with respect to the bid, so the compensation
in tendering probabilities is brought about without any rise in the bid at all.

Fig. 2. Effect of binding defensive measures on the probability of offer success

A sufficiently steep $U(x)$ schedule may lead to overbidding, thereby
rendering infeasible a mixed-strategy equilibrium in which bidders
offer the true value of the improvement. Under the conditions of
proposition 1, bidders still separate because the slope of the $U(x)$
schedule provides an incentive for high-$z$ bidders to make higher
offers than low-$z$ bidders. In this case the threat of defensive actions
increases the bids and can thus increase the probability of takeover
success. Overbidding in some acquisitions is consistent with the evidence
of Bradley (1980) that the posttakeover value of those shares
not tendered is on average lower than the tender price. 27

27 As a simple illustration, suppose that $z$ can take on just two values, $\bar{z}$ and $\underline{z}$. Suppose
that $U(\bar{z}) = 1$ but $U(x) = 0$ for all $x < \bar{z}$. Then if $\alpha$ and $\underline{z}$ are sufficiently high, it will pay
for the lower-value bidder to overbid, $x = \bar{z}$, to be accepted with certainty, since the gain
on his own shares will exceed the loss on the shares he purchases.

III. Welfare and Regulatory Implications

If we assume that the gain in value from a takeover arises from
increasing operating efficiencies rather than expropriation of nonequity
shareholders (e.g., bondholders and labor) and if we assume that
the direct resource costs of defensive measures are small, then defensive
actions that decrease the probability of takeover success decrease
welfare. Taking this view, Easterbrook and Fischel (1981) argue for a
passivity rule for management on the grounds that extracting a
higher bid is merely a redistribution, without net benefits but with the
social cost of deterring some potentially profitable synergies. Our
model suggests that defensive actions need not reduce social welfare.
By forcing the bidder to increase his offer or by reducing the asymmetry
of information about the posttakeover value of the target, they
can increase the probability of the offer’s success.

For example, we show that certain blocking defensive measures as 
well as cost-imposing litigation strategies force the bidder to raise his 
bid to a level that leads shareholders to tender with certainty, while 
some forms of value-reducing strategies also increase the probability 
of the offer’s success by reducing the asymmetry of information. In 
some cases, the bidder as well as the target is made better off by 
defensive measures. This will occur when the rise in the probability of 
success outweighs the higher payment the bidder is forced to make or 
the loss arising from the reduction in the target’s value. In conse¬ 
quence, if the model were extended to consider the decision of bid¬ 
ders to investigate the target, it is possible that defensive measures 
could lead to a higher overall probability of takeover and higher social 
welfare. 

However, some defensive measures such as poison pills act to re¬ 
duce dilution and thereby increase the offer price while lowering the 
probability of offer success. As in Grossman and Hart (1980), our 
model implies that managers that act in shareholders’ interest will 
take defensive actions to reduce dilution in order to raise the bid 
above the socially optimal level. Shareholders will support this activity 
because they bear only part of the social cost associated with the 
reduced probability of an offer’s success and capture a transfer gain 
from raising the offer price. 

This suggests that there may be some role for regulations that limit 
the use of defensive measures. However, it should be stressed that this 
argument takes the investment decisions of the target as given. If 
there exist preoffer investments that a target can make that increase 
synergies, then to encourage investment it may be socially preferable 
to allow targets to take actions to capture more of the synergistic
benefits. 
Our analysis further suggests that the Williams Act, by facilitating
defensive actions, may have promoted overbidding in hostile takeover
contests. The rise in bid premia subsequent to the Williams Act
described by Jarrell and Bradley (1980) is consistent with this
hypothesis. An alternative truncation hypothesis is that a rise in the cost
of bidding will raise the average bid premium by deterring profitable 
takeovers with relatively lower gains. 28 Overbidding can explain not 
only the higher premia paid to targets but also the lower abnormal 
returns to bidders, found by Jarrell and Bradley, after the Williams 
Act and later state acts. 29 Malatesta and Thompson (1988) provide 
evidence that a wealth transfer from bidders to targets (i.e., higher 
bids) was more important than truncation of the sample in causing 
the rise in target mean premia and lower abnormal returns to bid¬ 
ders. 

The discussion above suggests that regulation that facilitates defen¬ 
sive action can potentially increase welfare, but need not do so. Our 
analysis indicates that a necessary but not sufficient condition for a 
defensive strategy to increase welfare is that it benefits target share¬ 
holders. A recent paper by Jarrell and Poulsen (1987) indicates that, 
on average, antitakeover amendments lead to negative target stock 
price reactions, suggesting that they may, in general, be welfare de¬ 
creasing. However, the price reaction is not always negative and is on 
average more positive in those cases in which a large fraction of the 
firm is held by institutional shareholders, who are presumably better 
able to block amendments that oppose their interests. This suggests 
that antitakeover amendments sometimes are in the interest of target 
shareholders and thus may sometimes improve social welfare. 

Our analysis of the desirability of defensive measures may be sensi¬ 
tive to the assumptions about the effects of offer failure. The possible 
welfare gains arise from forcing up the bid or reducing informational 
asymmetry, which leads the bidder to succeed with higher probability. 
In practice, however, a failed first bid can be followed by a revised bid 
or a competing bid. If the target does not take defensive actions, it is 
likely to be taken over eventually, either by the initial bidder or, if he 
fails, by another bidder. This suggests that although defensive actions 
can increase the probability that an initial bid will succeed, it is un¬ 
likely that they will increase the probability that the firm will ulti¬ 
mately be taken over. 

If an initial failure leads to less efficient implementation of the 


28 Our model is also consistent with the truncation effect. Jarrell and Bradley's dis¬ 
cussion combines features of both explanations. 

29 Lower bidder returns could also be due to the increased costs imposed by
defensive measures.
improvement, because management either preempts the planned
changes inefficiently or finds a white knight who does so, then impli¬ 
cations similar to our analysis apply. On the other hand, if an initial 
failure does not prevent the target from ultimately being acquired by 
either the first bidder or a competing bidder with either an equal or a 
greater improvement, then our argument must be modified. In this 
case, defensive measures that reduce the probability of initial success 
can be socially beneficial if they give higher-improvement bidders 
more time to make offers. 


IV. Conclusion 

This paper presented a model of tender offers in which the bid per¬ 
fectly reveals the bidder’s private information about the size of the 
gain that can be generated by a takeover. The magnitude of the 
tender offer premium affects the probability that a bid succeeds, so 
that bidders with high-valued improvements who have more to gain 
from the offer’s success make higher bids than those bidders with 
lower gains. The model provides a number of testable implications 
relating to the determinants of an offer’s success. For example, we 
have shown that both high initial holdings by the bidder and the 
possibility of dilution of minority shareholders increase the probabil¬ 
ity that an offer will succeed. 

Our analysis has also demonstrated that the tendency of target 
shareholders to participate in a tender offer is affected by the man¬ 
agement’s defensive strategies. Some defensive actions can actually 
raise the probability of the offer’s success. These strategies may benefit
shareholders. In addition, even defensive measures that could potentially
cause the bid to be disallowed can benefit shareholders ex
ante, by inducing bidders to make higher offers. Furthermore, there 
is a tendency for shareholders to raise their probabilities of tendering 
in response to managerial defensive actions in an offsetting manner, 
so that in some cases the defensive measure will not lead to any net 
reduction in the overall probability of an offer’s success. Of course, 
defensive measures can also be designed to reduce the probability of 
an offer’s success in ways that can reduce shareholder value. 

The model suggests that a key determinant of the outcome of ten¬ 
der offers is whether target shareholders know as much as the bidder 
about the posttakeover value of the target’s shares. If information is 
symmetric, then a bidder can always purchase as many shares as he 
seeks by bidding one cent above the posttakeover value in a tender 
offer. With asymmetric information, even the strategy of overbidding 
will not necessarily assure the success of the offer because target 
shareholders will interpret the higher bid as an indication that the 
posttakeover value of the shares is higher. This effect may be more
apparent in the model presented in Appendix B, in which the equilibrium
bid is strictly lower with asymmetric than with symmetric information.

As shareholders become better informed about the size of the post¬ 
takeover value, the likelihood of a bid that is prone to failure (i.e., a 
bid far below the bidder’s maximum improvement z) becomes small. 
For bids in which a merger is expected or for management buy-outs, 
the posttakeover value of minority shares is often determined by a 
court decision about the fairness of the price. In this case, the bidder’s 
information may be little better than that of target shareholders.
Therefore, in the absence of management defensive actions and com¬ 
peting bids, we expect takeover bids for merger usually to succeed. 
On the other hand, in takeovers initiated to change the policy of the 
target without merger (e.g., Carl Icahn’s takeover of Trans World 
Airlines), the bidder may have superior information about the pros¬ 
pects for increasing firm value. In such cases, failure of the bid be¬ 
comes more likely. Our model suggests that future empirical studies 
should examine samples of these different kinds of takeovers sepa¬ 
rately. 

Like most theoretical work on this topic, our model has assumed 
that the bidder is rational and profit maximizing. However, others 
(e.g., Roll 1986) have suggested that bidders may be afflicted with
“hubris,” systematically overestimating their ability to improve firm
value. In our model, a bidder would have his bids accepted with
certainty if target shareholders believed that he was overly confident
and had a tendency to overbid. This suggests that “rational” bidders
may have an incentive to develop and maintain reputations for hubris.
Hence, in a repeated game, it may pay a rational bidder to
overbid, to persuade future targets that he too is prone to hubris. 30 
This suggests that future empirical work should also try to analyze 
separately those bidders that make a number of bids. 

Appendix A 

Proof of Proposition 1 

Parametrically differentiating (2) with respect to z and solving for dx/dz yields
an expression that is strictly positive, by (3). 


30 See Kreps et al. (1982) for a reputation model in which rational players mimic
irrational ones. Our argument for overbidding contrasts with the model of Leach 
(1988), in which merger bidders make low offers to gain the reputation for being tough 
bargainers. 
Proof of Proposition 2

Let us propose as off-equilibrium behavior by shareholders that if a bid for ω
> 0.5 − α is made, then shareholders will still mix their actions to satisfy (5).
Hence, regardless of ω, the bid x = z will satisfy the first-order condition (2).
Direct calculation verifies that with the proposed probability schedule, the 
second-order condition for an interior global optimum (as well as the first- 
order condition [2]) is satisfied by setting the bid x = z. At this bid, sharehold¬ 
ers are indifferent between accepting and rejecting, which is consistent with 
randomization. 

Only part 3 remains to be verified. Note that for a given x, P(x | ω) is
decreasing in ω. Regardless of what value for ω is selected by the bidder, his
optimal bid is x = z. Therefore, his profit on the shares he purchases is zero,
and his entire gain is due to his gain on the original α shares. He maximizes
expected wealth by choosing ω to maximize the probability of the offer’s
success. This occurs with the minimum value of ω consistent with obtaining
control, ω = 0.5 − α.


Appendix B 

Unobserved Tendering Costs 

In this Appendix we provide a model in which shareholders possess a com¬ 
mon cost of tendering that is unknown to the bidder. This is meant to de¬ 
scribe, more explicitly, situations in which the bidder does not know perfectly 
the costs and benefits to the target shareholders of tendering. In this case, the 
success of the offer is determined by whether 

$$x > \hat{z}(x) + c, \tag{B1}$$

where c is the cost of tendering, and $\hat{z}(x)$ is the shareholders’ evaluation of z
given a bid of x. The bidder solves the same optimization problem as in the
text, (1), and therefore has the same first-order condition with respect to his
bid, (2). Each value of z generates a corresponding optimal bid x(z). Furthermore,
in a perfectly revealing equilibrium, each value of z corresponds to a
different value of x, so that $z = \hat{z}(x)$. The probability of success is then the
likelihood that c falls in the range at which (B1) holds,

$$P(x) = \int_0^{x - \hat{z}(x)} f(c)\,dc. \tag{B2}$$

Differentiating (B2) with respect to x gives

$$P'(x) = f(x - \hat{z}(x))\,[1 - \hat{z}'(x)]. \tag{B3}$$

Substituting for P'(x) in (2) yields

$$\omega \int_0^{x - \hat{z}(x)} f(c)\,dc = f(x - \hat{z}(x))\,[1 - \hat{z}'(x)]\,(0.5z - x\omega). \tag{B4}$$

For an exogenously given distribution of c, this integral equation can be
solved to give the equilibrium inverse bidding schedule $\hat{z}(x)$, which from (B2)
also gives the probability of success schedule.

We shall develop our analysis under the assumption of a uniform distribution
for the tendering cost. We have shown that similar results can be obtained
assuming a power density as well. 31 Substituting the uniform density

$$f(c) = \frac{1}{\bar{c}}, \quad c \in [0, \bar{c}], \tag{B5}$$

into (B4) and letting $\hat{z}(x) = z(x)$, the inverse of the bidding schedule x(z), gives

$$\omega(x - z) = [1 - z'(x)]\,(0.5z - x\omega). \tag{B6}$$

Symmetric Information Case 

A special case of interest is the one in which information about z is symmetric
(i.e., z is known to all target shareholders). We refer to this as the “symmetric
information” case, even though c is still assumed unknown to the bidder. In
this case, the inference schedule is $\hat{z}(x) = z$ for all x (equilibrium or not), so that
$\hat{z}'(x) = 0$. Substituting into (B6) gives the symmetric information bidding
function,

$$x^s = \frac{1 + 2\omega}{4\omega}\, z. \tag{B7}$$

Note that since ω < 0.5, this solution satisfies the fundamental property that
$x^s > z$, so that (when the tendering cost is nonnegative) there is a positive
probability of offer success.
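
As a rough numerical check on the symmetric information case, the sketch below maximizes the bidder's objective $(0.5z - x\omega)P(x)$ on a grid, taking $P(x) = (x - z)/\bar{c}$ from the uniform density (B5), and compares the maximizer with the closed form $x = z(1 + 2\omega)/(4\omega)$ that follows from setting $z'(x) = 0$ in (B6). The function names, the grid search, and the parameter values are illustrative assumptions and are not part of the original analysis.

```python
def success_probability(x, z, c_bar):
    """P(x) under a uniform tendering cost on [0, c_bar] when z is known."""
    return min(max((x - z) / c_bar, 0.0), 1.0)


def bidder_payoff(x, z, omega, c_bar):
    """Bidder's expected gain (0.5 z - x omega) P(x)."""
    return (0.5 * z - x * omega) * success_probability(x, z, c_bar)


def symmetric_bid(z, omega):
    """Closed form from (B6) with z'(x) = 0: omega (x - z) = 0.5 z - x omega."""
    return z * (1.0 + 2.0 * omega) / (4.0 * omega)


def grid_argmax(z, omega, c_bar, steps=200_000):
    """Brute-force maximizer of the payoff over bids in [z, z + c_bar]."""
    best_x, best_val = z, bidder_payoff(z, z, omega, c_bar)
    for i in range(1, steps + 1):
        x = z + c_bar * i / steps
        val = bidder_payoff(x, z, omega, c_bar)
        if val > best_val:
            best_x, best_val = x, val
    return best_x


if __name__ == "__main__":
    z, omega, c_bar = 1.0, 0.4, 1.0      # hypothetical parameter values
    print(symmetric_bid(z, omega))       # closed-form bid
    print(grid_argmax(z, omega, c_bar))  # numerical maximizer
```

With ω < 0.5 the closed-form bid exceeds z, so the implied success probability $(x - z)/\bar{c}$ is strictly positive, in line with the remark above.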

Asymmetric Information Case 

When z is unknown to the target shareholders, the bidding schedule solves
the differential equation (B6). Let $x^a$ denote the asymmetric information bid.
A boundary condition that the equilibrium schedule must satisfy is that the
highest-type bidder $\bar{z}$ must make a maximum bid $\bar{x}^a$ equal to $x^s(\bar{z})$ of the
symmetric information case.

To see why, suppose that the maximum bid $\bar{x}^a > \bar{x}^s$. Then it would pay for
the $\bar{z}$ type to reduce his bid to $\bar{x}^s$ because of two benefits. First, if he were
viewed as type $\bar{z}$, then by revealed preference, since he preferred to bid $\bar{x}^s$ to
$\bar{x}^a$ under symmetric information, he would still rather do so. Second, he may
be viewed as a lower type, $z < \bar{z}$. If so, his gains from bidding $\bar{x}^s$ are even
greater since his probability of success is greater.

Suppose, on the other hand, that $\bar{x}^a < \bar{x}^s$. Then it would pay for type $\bar{z}$ to
raise his bid to $\bar{x}^s$, for essentially the same reasons as above. By revealed
preference, if his type is still viewed as $\bar{z}$, since he chose $\bar{x}^s$ under symmetric
information, he will still prefer it here. Second, if changing his bid were to
lead to a lower inference of his type, his probability of success would rise and
his gains would be even greater.

Having established this, we now show that the level of the bid under asymmetric
information for any type below $\bar{z}$ is smaller than the bid under symmetric
information. We may rewrite (B6) as

$$x^a = \frac{1 + 2\omega}{4\omega}\, z - z'_a \left( \frac{0.5z - \omega x^a}{2\omega} \right), \tag{B8}$$


31 A power density function for costs may be written as $f(c) = (1 - \beta)\,\bar{c}^{\,\beta - 1} c^{-\beta}$, β < 1.



where $z'_a$ is the derivative of the inference schedule with respect to x under
asymmetric information. Subtracting (B7) from (B8), we see that

$$x^a - x^s = -z'_a \left( \frac{0.5z - \omega x^a}{2\omega} \right). \tag{B9}$$

In a separating equilibrium, by proposition 1, $z'_a > 0$. The term in parentheses
is proportional to the bidder's profit from winning, which must be positive. So
$x^a < x^s$. It follows immediately from (B2) that in equilibrium a given type has a
lower probability of success under asymmetric than under symmetric information,
$P^a(z) < P^s(z)$.

Suppressing a superscripts, we can solve the differential equation (B6) by
the substitution v = z/x, which gives a separable differential equation in v and
x. Imposing the appropriate boundary conditions gives the solution in implicit
form of

$$(x - z)^{(1 - 2\omega)/(4\omega - 1)}\,(4\omega x - z)^{-2\omega/(4\omega - 1)} = (0.5 - \omega)^{(1 - 2\omega)/(4\omega - 1)}\,(2\omega)^{-1/(4\omega - 1)}\,\bar{z}^{-1} \tag{B10}$$

if ω ≠ 1/4 and

$$2e^{-3/2}(x - z) = \bar{z}\, e^{-x/[2(x - z)]} \tag{B11}$$

if ω = 1/4.
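
To illustrate how the implicit solution can be used, the sketch below recovers the asymmetric information bid of a type z by bisection and compares it with the symmetric information bid. It assumes the implicit solution for ω ≠ 1/4 takes the form $(x - z)^{(1-2\omega)/(4\omega-1)}(4\omega x - z)^{-2\omega/(4\omega-1)} = (0.5-\omega)^{(1-2\omega)/(4\omega-1)}(2\omega)^{-1/(4\omega-1)}\bar{z}^{-1}$, which is what separating variables in (B6) and imposing the boundary condition at $\bar{z}$ appears to give. The restriction to 1/4 < ω < 1/2, the function names, and the parameter values are assumptions made for the example.

```python
import math


def symmetric_bid(z, omega):
    """Symmetric-information bid: (B6) with z'(x) = 0."""
    return z * (1.0 + 2.0 * omega) / (4.0 * omega)


def asymmetric_bid(z, z_bar, omega, iterations=200):
    """Bid of type z under asymmetric information, from the implicit solution.

    Assumes 0.25 < omega < 0.5 and 0 < z <= z_bar, so that both bases
    (x - z) and (4*omega*x - z) are positive for bids above z.
    """
    if not 0.25 < omega < 0.5:
        raise ValueError("this sketch only covers 0.25 < omega < 0.5")
    if not 0.0 < z <= z_bar:
        raise ValueError("need 0 < z <= z_bar")
    upper = symmetric_bid(z, omega)
    if z == z_bar:
        return upper  # boundary condition: the top type bids as under symmetry

    d = 4.0 * omega - 1.0
    a = (1.0 - 2.0 * omega) / d
    b = 2.0 * omega / d
    log_rhs = (a * math.log(0.5 - omega)
               - math.log(2.0 * omega) / d
               - math.log(z_bar))

    def gap(x):
        # log of the left side minus log of the right side
        return a * math.log(x - z) - b * math.log(4.0 * omega * x - z) - log_rhs

    # gap tends to -infinity as x approaches z, and gap(upper) = log(z_bar / z) > 0,
    # so shrinking lo toward z brackets a root, which bisection then locates.
    lo, hi = upper, upper
    while gap(lo) >= 0.0:
        lo = z + 0.5 * (lo - z)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    z, z_bar, omega = 0.5, 1.0, 0.4   # hypothetical parameter values
    print(asymmetric_bid(z, z_bar, omega), symmetric_bid(z, omega))
```

Under the uniform density the success probability is $(x - z)/\bar{c}$, so a bid closer to z translates directly into a lower probability of success for that type.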

We have extended this model to include a managerial defensive action that 
imposes costs on the bidder prior to the outcome. In this case, the bidder’s 
objective is 

$$\max_x \; (0.5z - x\omega)P(x) - h(x),$$

$h'(x) < 0$. Under the specific functional form $h(x) = k(\bar{x} - x)^2$, k > 0, we have
verified a “compensation effect” similar to that described in the text. This is
that the cost-imposing measure encourages higher bidding and hence raises 
the probability of offer success. 
Factor Market Search and the Structure of
Simple General Equilibrium Models 


Arthur J. Hosios 

University of Toronto 


This paper presents a simple general equilibrium model in which 
unemployed workers search for jobs and vacant firms search for 
employees. Formally, I develop a two-sector, constrained efficient 
version of the Diamond-Mortensen-Pissarides matching model of 
trade coordination. This approach to modeling factor market search 
appears promising since its algebraic development parallels Jones’s 
treatment of the two-sector model of production, and the latter 
framework underlies most applied general equilibrium analyses. 
Some illustrative short-run and steady-state results are presented 
concerning the behavior of open and closed economies that exhibit 
unemployment and vacancies. 


I. Introduction 

Search is a common market activity of practical and theoretical eco¬ 
nomic interest. Indeed, a long and ongoing research program on 
search behavior and its implications for market equilibrium has 
added considerably to our understanding of issues in labor econom¬ 
ics, macroeconomics, and industrial organization. Still, hardly any 
branch of applied general equilibrium analysis has taken these impli¬ 
cations into account; current views on, say, tax incidence and com¬ 
parative advantage basically ignore the allocative issues that arise in 
economies in which, in the absence of an outside price-setting agent, 
buyers and/or sellers must actively search for trading partners. The 
major stumbling block seems to be analytical; by comparison with the 
perfectly competitive market model that underlies most applied work,
existing equilibrium search models do not as yet employ a common 
and generally accepted structure and, for the most part, are ill suited 
to comparative static analysis. 1 

This paper does not attempt to model any particular search process 
in detail. Rather, attention is restricted to the general class of search 
models in which the (expected) numbers of buyers and sellers who 
agree to trade and the (expected) division of their gains from trade 
can both be expressed as functions of the numbers of buyers and 
sellers who initially enter the market. The former relationship is 
called a matching technology and plays a role here that is analogous to 
that of an industry production function in conventional analyses; the 
latter relationship is called a surplus sharing rule and determines the 
factor prices of employed agents as functions of the market condi¬ 
tions experienced by their unemployed counterparts. This simple but 
compact representation of a market equilibrium will be used below to 
answer conventional trade and finance questions for economies in 
which, in each sector, unemployed workers search for jobs and vacant 
firms search for employees. 

The idea that an equilibrium search model can be summarized by 
simple functions of the numbers of participants involved underlies 
the bilateral matching models of Diamond (1982, 1984a, 1984b), Mortensen
(1982), and Pissarides (1984, 1985a, 1985b). In this paper I
extend their analyses of factor market search to a multisector econ¬ 
omy and thereby develop a simple and robust analytical tool for mod¬ 
eling search in that setting. This exercise appears promising since the 
resulting framework is almost identical to the one that already domi¬ 
nates most applied general equilibrium work, that is, Jones’s (1965) 
classic description of the two-sector model of production. It also 
nicely complements earlier general equilibrium models of unemploy¬ 
ment by Lucas and Prescott (1974) and Mussa (1978); labor in Lucas 
and Prescott and capital in Mussa are unemployed in transit between 
sectors, which is time-consuming, but they are otherwise fully em¬ 
ployed. In this paper, by contrast, unemployed workers and vacant 
firms can quickly change sectors; the time-consuming activity for 
either type of unattached agent is finding an acceptable trading part¬ 
ner. Hence each sector in my model will be characterized by positive 
equilibrium rates of unemployment and vacancy. 


1 While a limited taxonomy is possible, the search literature nonetheless remains 
frustrating for those interested in applied work: in terms of quantities, some models 
generate a natural rate that is too high, others generate one that is too low, and still
others generate one that is just right. In terms of prices, some models yield a nondegen¬ 
erate equilibrium price distribution, others yield a degenerate distribution at the mo¬ 
nopoly price, and still others yield a degenerate distribution at the competitive price. 
Attention is further restricted to economies whose equilibrium allocations
are always constrained Pareto efficient. Thus, unlike earlier
research on matching models that seeks to identify and exploit 
sources of inefficiency, the present analysis begins with an economy 
whose equilibrium allocation is known to be constrained efficient. 2 
Furthermore, in contrast to earlier trade and development models 
that generate unemployment by imposing wage rigidities (Bhagwati 
and Srinivasan 1983, chaps. 22, 23), unemployment and vacancies in 
the models that follow do not imply that government policies exist 
that achieve Pareto improvements. 

In effect, a benchmark formulation is established by adapting the 
structure of Jones’s simple general equilibrium model of production 
(in which frictionless factor markets and efficient resource allocation 
are taken for granted) to describe economies in which information 
and transaction costs are important and in which the resulting natu¬ 
ral rates of unemployment and vacancy are positive but efficient. 
Whether uninternalized search externalities are empirically important
or not, it is clear that their distinct implications for resource
allocation and income distribution cannot be fully appreciated in the 
absence of a constrained efficient standard for comparison. 

The paper proceeds as follows. Sections II-IV are introductory: 
Section II describes agents’ preferences, endowments, and produc¬ 
tion technologies; Section III reviews the basic one-sector matching 
model of search; and Section IV describes a two-sector, two-factor 
economy in which the allocation of labor and capital in each sector is 
governed by the type of matching process introduced earlier in Sec¬ 
tion III. Sections V and VI then describe this model’s basic short-run 
and steady-state properties. 

The distinction in this paper between short-run and steady-state 
behavior is novel. The standard neoclassical short-run model is a full- 
employment, specific-factor model in which, given some initial alloca¬ 
tion of resources, labor is freely mobile between sectors while capital is 
(temporarily) immobile (see, e.g., Mayer 1974; Mussa 1974; Neary 
1978b). In this paper, by contrast, unemployed workers and vacant
firms are freely mobile between sectors while the initial stocks of 
employed labor and employed capital in each sector are (temporarily) 
fixed. The key assumption here is that employed agents cannot search 
on the job but must quit trading relationships to find new partners. 
With wages adjusting to keep employed agents better off than their 
unemployed counterparts, it never pays to quit, and, as a result, un¬ 
employed agents are effectively more mobile in the short run. 

2 Diamond (1984b) and Mortensen (1986) survey the matching literature, highlighting
the sources and implications of uninternalized search externalities.
Specific trade and tax applications of the model are described in
Sections VII and VIII. Since sectoral matching technologies play a 
central role in these analyses, Section IX considers the general prob¬ 
lems of deriving and estimating these technologies. 

Before starting, I should note that Davidson, Martin, and Matusz
(1987, 1988) have independently developed two-sector models that 
also use the Diamond-Mortensen-Pissarides approach to modeling 
factor market search. While the present paper examines many of the 
same tax and trade issues considered in Davidson et al. (1988), the 
analysis and results described below are quite different because of two 
special assumptions. First, Davidson et al. assume that the surplus 
sharing rule applied to each match is a given constant, which implies 
that the allocation of resources in their models is almost never con¬ 
strained efficient. By contrast, the endogenous surplus sharing rules 
in this paper respond to demand and supply in a manner that allows 
for the complete internalization of any search externalities. 

Second, Davidson et al. assume that agents’ common discount rate 
is zero, which implies that short-run adjustment paths are irrelevant. 
With the introduction of a positive discount rate, however, it will be 
seen that short-run and steady-state behavior are quite distinct and 
that, perhaps most interesting, the analytical focus of the two-factor 
model shifts from a conventional description of the factor prices of 
employed labor and capital to a description of the asset values and 
permanent incomes of unemployed workers and vacant jobs. 


II. Basic Assumptions 

The following assumptions are maintained throughout this paper. 

Consumption goods.—Two nonstorable consumption goods, $x_1$ and
$x_2$, are bought and sold in perfectly competitive and frictionless commodity
markets and are produced using fixed supplies of the two
inputs, labor and capital. There is no store of value in this economy. 

Factor supplies .—There is a total population of n + k individuals: of 
these, n are each endowed with one unit of labor and are called 
workers, and k are each endowed with one unit of physical capital and 
are called either firms or entrepreneurs, depending on their roles as 
either employers or consumers. Both n and k are fixed but large
numbers.

Production technology .—The available technology is quite simple: 
each firm can employ only one worker, and any given worker-firm 
pair can together produce only one unit of output per period, that is, 
one unit of either $x_1$ or $x_2$, depending on the sector in which this
worker-firm pair is located.

Letting $n_i$ and $k_i$ denote, respectively, the numbers of workers and
firms located in sector i, we have

$$n = n_1 + n_2, \quad k = k_1 + k_2. \tag{1}$$

Furthermore, if we let $u_i$ and $v_i$ denote, respectively, the numbers of
unemployed workers and vacant firms in i, it follows that

$$x_i = e_i = n_i - u_i = k_i - v_i, \quad i = 1, 2, \tag{2}$$

since the number of employed workers in i, $e_i$, is identically equal to
the number of filled jobs in that sector.

Preferences.—Workers and entrepreneurs have identical homothetic
preferences that exhibit interminability of consumption of
either produced good whenever income is positive; that is, given
prices $\{p_1, p_2\}$ and income y, their common indirect utility function is
represented by $\phi(p_1, p_2)y$, where $\phi$ exhibits the usual properties. With
these preferences, moreover, relative aggregate demands will be a
function only of relative prices; that is, $x_1/x_2 = f(p_1/p_2)$, where $f' < 0$.
As the only source of earned income is the production and sale of 
output, I assume that unemployed workers and unemployed entre¬ 
preneurs simply enjoy some fixed utility that is normalized to zero. 
All agents employ the common discount rate r. 

Mobility between sectors. —Unemployed workers and vacant firms can 
move costlessly and instantaneously between sectors, whereas em¬ 
ployed workers and filled firms are jointly immobile. In other words, 
given any attached worker-firm pair in sector i, if one or both trading 
partners wish to move to sector j, they must first break the match (i.e.,
quit) to do so. 

Mobility within sectors .— Associated with each sector is a matching 
process that brings together the unattached buyers and sellers located 
in that sector and determines their terms of trade. In particular, any 
unattached agent in sector i can locate a trading partner in i only by 
participating in i's matching process.

To complete the model, I need only specify the matching process 
available in each sector of this economy. An obvious candidate is the 
matching process coordinated by a Walrasian auctioneer; this particu¬ 
lar process has of course been studied at great length and represents 
the accepted “as if” description of how resources are allocated when
transaction and information costs are economically unimportant. 
When these costs are significant, however, the process by which 
buyers and sellers are brought together is generally viewed to be one 
in which the members of one or both of these groups actively search 
for trading partners among the other group and in which individual 
search takes time and is not always or immediately successful. 

The goal of this paper is to identify the aggregate implications of 
these search processes. To that end, I adopt a simple version of the 
recent bilateral matching models due to Diamond (1982, 1984a,
1984b), Mortensen (1982), and Pissarides (1984, 1985a, 1985b). The
main advantage of their approach to modeling trade coordination 
and price setting in the absence of an auctioneer is its analytical sim¬ 
plicity: it avoids the micro details of search and strategic behavior 
while providing a relatively straightforward and general way to cap¬ 
ture some important macro features of economies with trade frictions 
and imperfect information, such as unemployment and vacancies. 

III. A Matching Model 

This section introduces the basic Diamond-Mortensen-Pissarides one- 
sector matching model and describes its efficiency properties. In this 
model, unemployed workers and vacant firms are brought together 
pairwise by a given stochastic matching technology, and once to¬ 
gether, their terms of trade are determined instantaneously by a 
given joint surplus sharing rule. 3 Later sections extend this analysis to 
a two-sector setting. 

Matching. —Let m(u, v) denote the rate of pairwise matching that 
occurs among u unemployed workers and v vacant firms. (Each of 
these matches results in the start of production.) The function m 
represents the aggregate outcome of some underlying search process 
in which workers try to find jobs and firms try to hire workers. At an 
individual level, each unemployed worker experiences the arrival of 
jobs as a Poisson process with arrival rate m/u, and each vacant firm
experiences the arrival of potential employees as a Poisson process 
with arrival rate m/v. Given n workers and k firms, recall that u = n - 
e and v = k - e, where e is total employment. 

Separation.—Let b denote the exogenously given rate at which existing
jobs break up. This creates a flow, be, of workers who lose their
jobs and of firms that lose their employees. As a result, the rate of
change of total employment is

$$\dot{e} = m(n - e, k - e) - be, \tag{3}$$

where m(u, v) is the corresponding flow out of unemployment and
vacancy: in steady-state equilibrium $\dot{e} = 0$. 4

Lifetime incomes. —Let $W_e$ and $W_u$ denote the expected present value
of lifetime income of employed and unemployed workers, respectively.
Thus $W_e$ and $W_u$ represent a worker’s asset value when employed
and unemployed; for any given marginal utility of income $\phi$,
$\phi W_e$ and $\phi W_u$ represent the corresponding expected present
values of lifetime utility.

3 In effect, a matching technology and a sharing rule will be added to the list of
primitives that characterize the economy under consideration.

4 The assumption that $b$ is fixed simplifies the analysis but is not essential; the
assumption that $b$ is positive is necessary to generate steady-state unemployment.

Let $w$ denote the wage paid to employed workers. The standard
asset-value equation tells us that an employed worker’s permanent
income, or yield, $rW_e$, equals his wage $w$ plus the expected
capital gain due to a change in status from employment to unemployment,
$b(W_u - W_e)$, given the transition probability $b$, plus the
appreciation or change in asset value $\dot W_e$. Similarly, for unemployed
workers, $rW_u$ equals the expected capital gain, $(m/u)(W_e - W_u)$,
where $m/u$ is the transition probability from unemployment to
employment, plus the change in asset value $\dot W_u$. That is,⁵

$$rW_e = w - b(W_e - W_u) + \dot W_e, \tag{4a}$$

$$rW_u = \frac{m}{u}(W_e - W_u) + \dot W_u. \tag{4b}$$

Letting $W_f$ and $W_v$ denote the expected present value of lifetime
income of filled and vacant firms, respectively, I also have

$$rW_f = (p - w) - b(W_f - W_v) + \dot W_f, \tag{4c}$$

$$rW_v = \frac{m}{v}(W_f - W_v) + \dot W_v, \tag{4d}$$

where $p$ is the price of output. Observe that

$$\dot W_x = \frac{\partial W_x}{\partial e}\,\dot e = \frac{\partial W_x}{\partial e}\,(m - be), \qquad x = e, u, f, v, \tag{5}$$

and hence $\dot W_x = 0$ in steady-state equilibrium.

Surplus division. —A worker’s net surplus from securing a job equals
$W_e - W_u$, and a firm’s net surplus from hiring a worker equals
$W_f - W_v$; from (4a) and (4c),

$$W_e - W_u = \frac{w - rW_u + \dot W_e}{r + b}, \qquad W_f - W_v = \frac{p - w - rW_v + \dot W_f}{r + b}.$$


5 Though omitted here, a nonpecuniary search cost $s_u$ could easily be introduced, in
which case

$$rW_u = -\frac{s_u}{\phi} + \frac{m}{u}(W_e - W_u) + \dot W_u.$$

Hence I also ignore the possibility that an unemployed worker could increase the
probability of changing status by searching more intensively, e.g., by choosing a larger
$s_u$. Later results do not rely on either of these simplifying assumptions (see Hosios, in
press).
Let $\Sigma$ denote the total surplus created by a worker-firm match. Then
these expressions give

$$\Sigma = W_e + W_f - W_u - W_v = \frac{p - rW_u - rW_v + \dot W_e + \dot W_f}{r + b}. \tag{6}$$

It is assumed that the worker in any given worker-firm pair receives
the fraction $\beta$ of their joint surplus, $\Sigma$, so that
$W_e - W_u = \beta\Sigma$ and $W_f - W_v = (1 - \beta)\Sigma$. Therefore,
from (4b), (4d), and (6), it follows that

$$rW_u = \frac{m}{u}\,\beta\Sigma + \dot W_u, \tag{7a}$$

$$rW_v = \frac{m}{v}\,(1 - \beta)\Sigma + \dot W_v, \tag{7b}$$

$$\Sigma = \frac{p + \dot W_e + \dot W_f - \dot W_u - \dot W_v}{r + b + (m/u)\beta + (m/v)(1 - \beta)}. \tag{7c}$$

Observe that $\beta$ need not be constant and could as well be some function
of the numbers of participants on both sides of the market, say
$\beta = \beta(u, v)$.⁶

Steady-state equilibrium. —If we take the supplies of labor and capital,
$\{n, k\}$, as given, $\dot e = 0$ can be solved for the steady-state employment
level, $e$, and hence for the steady-state unemployment and vacancy
numbers, $u = n - e$ and $v = k - e$. Now that we have determined the
transition probabilities $m/u$ and $m/v$ (and the sharing rule when it is
endogenous as well), (4), (7), and $\dot W_x = 0$ can be straightforwardly
solved for agents’ permanent incomes.
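
As a numerical illustration of this steady-state calculation, the following minimal sketch assumes a Cobb-Douglas matching function, $m(u, v) = A u^{\alpha} v^{1-\alpha}$, and a constant sharing rule $\beta$. Neither functional form is specified in the text, and the names `matching`, `steady_state_employment`, and `permanent_incomes` are hypothetical. The sketch solves $m(n - e, k - e) = be$ by bisection and then evaluates (7a)-(7c) with every appreciation term set to zero.

```python
def matching(u, v, scale=1.0, alpha=0.5):
    """Assumed Cobb-Douglas matching rate m(u, v) = scale * u^alpha * v^(1 - alpha)."""
    return scale * u ** alpha * v ** (1.0 - alpha)


def steady_state_employment(n, k, b, scale=1.0, alpha=0.5):
    """Solve m(n - e, k - e) = b * e for e by bisection on [0, min(n, k)]."""
    lo, hi = 0.0, min(n, k)
    # m - b*e is positive at e = 0, negative at e = min(n, k), and decreasing in e.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if matching(n - mid, k - mid, scale, alpha) - b * mid > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def permanent_incomes(n, k, b, r, p, beta, scale=1.0, alpha=0.5):
    """Steady-state surplus and permanent incomes from (7a)-(7c) with zero appreciation."""
    e = steady_state_employment(n, k, b, scale, alpha)
    u, v = n - e, k - e
    m = matching(u, v, scale, alpha)
    surplus = p / (r + b + (m / u) * beta + (m / v) * (1.0 - beta))  # (7c)
    return {
        "e": e,
        "surplus": surplus,
        "rW_u": (m / u) * beta * surplus,          # (7a)
        "rW_v": (m / v) * (1.0 - beta) * surplus,  # (7b)
    }


if __name__ == "__main__":
    print(permanent_incomes(n=1.0, k=1.0, b=0.25, r=0.05, p=1.0, beta=0.5))
```

With equal factor supplies, $\alpha = 1/2$, and $\beta = 1/2$, the two unattached factors are symmetric, so the sketch returns equal permanent incomes for unemployed workers and vacant firms.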

Efficiency. —Let $Q$ denote the present discounted value of output
where there are $n$ workers, $k$ firms, and an initial employment level $e_0$;
specifically,

$$Q = \int_0^\infty e^{-rt}\,e(t)\,dt \quad \text{subject to} \quad \dot e = m(u, v) - be$$

and $e(0) = e_0$. Following Diamond (1980), I derive expressions below
for the change in the present discounted value of aggregate output,
due to derivative changes in the supplies of labor and capital, along
the convergent path from one steady state to another. That is,
evaluating the derivatives of

$$rQ = e + \frac{\partial Q}{\partial e}\,[m(u, v) - be] \tag{8a}$$

at $\dot e = 0$ gives

$$r\frac{\partial Q}{\partial n} = m_u\frac{\partial Q}{\partial e}, \qquad r\frac{\partial Q}{\partial k} = m_v\frac{\partial Q}{\partial e}, \tag{8b}$$

where

$$\frac{\partial Q}{\partial e} = \frac{1}{r + b + m_u + m_v}. \tag{8c}$$

6 This description of workers’ surplus share plays the same role as the price-taking
assumption in competitive markets; i.e., individual workers and firms view $\beta$ as given
and fixed even though its equilibrium value can be determined by the numbers of
buyers and sellers in the market. For example, if $\beta = \beta(u, v)$, where $\beta_u < 0$ and
$\beta_v > 0$, then workers’ surplus share is a decreasing (increasing) function of the supply of
unattached labor (capital) to the market.


Observe that $\partial Q/\partial e$ represents the joint social product of an additional
employed worker-firm pair, while $r(\partial Q/\partial n)$ and $r(\partial Q/\partial k)$ describe the
separate flow contributions of an extra unemployed worker and an
extra vacant firm, respectively.⁷

The only explicit decision made by workers and firms in this model
is whether or not to participate in the given matching process. (These
decisions will later determine the direction of intersectoral factor
mobility.) For each type of unattached agent, the incentive to enter will
be efficient only if that factor’s social marginal product equals the
private return from participating in the given matching process; that
is, necessary conditions for efficient labor and capital mobility are

$$p\,r\frac{\partial Q}{\partial n} = rW_u, \qquad p\,r\frac{\partial Q}{\partial k} = rW_v, \tag{9}$$

respectively.

Substituting from (7a) and (7b), evaluated at a steady-state equilibrium,
and from (8b), we can write these efficiency conditions as

$$p\,m_u\frac{\partial Q}{\partial e} = \beta\,\frac{m}{u}\,\Sigma, \qquad p\,m_v\frac{\partial Q}{\partial e} = (1 - \beta)\,\frac{m}{v}\,\Sigma.$$

In turn, substituting for $\Sigma$ and $\partial Q/\partial e$, from (7c) and (8c) at $\dot e = 0$, we
have the following lemma.

Lemma 1. Efficient factor mobility results only if the matching technology
and the sharing rule together satisfy⁸

$$m_u = \beta\,\frac{m}{u}, \qquad m_v = (1 - \beta)\,\frac{m}{v}. \tag{10}$$


7 For example, since an extra unemployed worker changes the matching rate by $m_u$,
while an additional match contributes $\partial Q/\partial e$, the combined marginal effect of entry by
labor is simply $m_u(\partial Q/\partial e)$.

8 These elasticity conditions are quite general and in fact exhaust the efficiency
implications of the current matching literature (see Hosios, in press).
Three features here are noteworthy: (1) Efficient factor mobility
implies equality between the private gain and social contribution of an
additional worker-firm match,⁹ that is,

$$\Sigma = p\,\frac{\partial Q}{\partial e} = \frac{p}{r + b + m_u + m_v}. \tag{11}$$

(2) A matching technology that exhibits constant returns to scale,
where $m = um_u + vm_v$, is necessary, at least locally, for efficient
resource allocation. (3) Given constant returns in matching, (10) defines
the sharing rule, $\beta(u, v) = um_u/m$, that internalizes any entry or exit
externality.
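
The content of Lemma 1 can be checked numerically. The sketch below again assumes a Cobb-Douglas matching function (an illustrative choice, not one made in the text), for which the efficient share $um_u/m$ equals the elasticity $\alpha$. The function names are hypothetical. It compares each unattached factor's private return, from (7a)-(7c) at a steady state, with the value of its social marginal product, from (8b)-(8c).

```python
def cobb_douglas(u, v, scale=1.0, alpha=0.5):
    """Return m, m_u, m_v for the assumed matching rate scale * u^alpha * v^(1 - alpha)."""
    m = scale * u ** alpha * v ** (1.0 - alpha)
    m_u = alpha * m / u
    m_v = (1.0 - alpha) * m / v
    return m, m_u, m_v


def efficient_share(u, v, scale=1.0, alpha=0.5):
    """Sharing rule u * m_u / m implied by (10) under constant returns."""
    m, m_u, _ = cobb_douglas(u, v, scale, alpha)
    return u * m_u / m


def private_vs_social(u, v, r, b, p, beta, scale=1.0, alpha=0.5):
    """Private returns (rW_u, rW_v) and social values p*r*dQ/dn, p*r*dQ/dk in steady state."""
    m, m_u, m_v = cobb_douglas(u, v, scale, alpha)
    dQ_de = 1.0 / (r + b + m_u + m_v)                                # (8c)
    surplus = p / (r + b + beta * m / u + (1.0 - beta) * m / v)      # (7c)
    private_worker = beta * (m / u) * surplus                        # (7a)
    private_firm = (1.0 - beta) * (m / v) * surplus                  # (7b)
    social_worker = p * m_u * dQ_de                                  # (8b)
    social_firm = p * m_v * dQ_de                                    # (8b)
    return private_worker, social_worker, private_firm, social_firm


if __name__ == "__main__":
    for beta in (0.5, 0.8):
        print(beta, private_vs_social(0.2, 0.3, r=0.05, b=0.1, p=1.0, beta=beta))
```

When $\beta$ equals the matching elasticity the private and social values coincide; when workers' share exceeds it, unemployed workers are overcompensated and vacant firms undercompensated relative to their social contributions.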


IV. A Two-Sector Search Model 

The remaining sections of this paper examine a two-sector economy
in which the allocation of resources in each sector is governed by the
type of matching process described in Section III. In particular, if we
take output prices $\{p_1, p_2\}$ and factor supplies $\{n, k\}$ as given,
equilibrium is described by

$$n_1 + n_2 = n, \qquad k_1 + k_2 = k, \tag{12a}$$

$$rW_u^1 = rW_u^2, \qquad rW_v^1 = rW_v^2, \tag{12b}$$

in addition to the sector-specific counterparts to equations (3)-(7).¹⁰
The terms $rW_u^i$ and $rW_v^i$ are the permanent incomes of unemployed
workers and vacant firms in sector $i$, respectively; these in turn depend
on the corresponding levels of unemployment and vacancy, $u_i$
and $v_i$, as well as on the matching technology and sharing rule employed
in sector $i$, $m^i(u_i, v_i)$ and $\beta^i$.¹¹ To simplify, the same separation
rate, $b$, is assumed to characterize turnover in both sectors.

All agents in this economy attempt to maximize the expected dis¬ 
counted value of their lifetime incomes. Since unemployed workers 


9 Given (7c) and (8c) at $\dot e = 0$, (10) implies (11); i.e., efficient exit by each factor
separately implies efficient exit by both factors jointly (but not vice versa; i.e., [11] does
not imply [10]).

10 Note that the two-sector counterpart to (5), describing the appreciation of agents’
asset values, is

$$\dot W_x^i = \frac{\partial W_x^i}{\partial e_1}\,\dot e_1 + \frac{\partial W_x^i}{\partial e_2}\,\dot e_2.$$

11 I also assume that the matching technologies, $m^1$ and $m^2$, differ. This has two
interpretations: Either the resource allocation processes are actually different across
sectors, say between rural and urban areas or between agriculture and manufacturing,
or these processes are the same but the sector-specific production functions are different
(and allow substitution between employed labor and capital). Therefore, the different
$m^i$ are the simplest way to model the sector-specific concatenation of these production
and matching processes.
and vacant firms are freely mobile between sectors, their permanent
incomes must be equal in equilibrium; hence (12b).¹² In the
small-open-economy version of this model, output prices $\{p_1, p_2\}$ are
specified exogenously; in the closed-economy version, the relationship
between the quantities demanded of each good and relative commodity
prices, $e_1/e_2 = f(p_1/p_2)$, $f' < 0$, completes the model.

The main distinguishing assumptions in this paper follow. 

Assumption 1. The matching technology in sector $i$, $m^i(u_i, v_i)$,
exhibits constant returns to scale,

$$m^i = u_i m_u^i + v_i m_v^i, \qquad i = 1, 2. \tag{13a}$$

Assumption 2. The surplus sharing rules satisfy (10), so that

$$\beta^i(u_i, v_i) = \frac{u_i m_u^i}{m^i}, \qquad i = 1, 2. \tag{13b}$$

It will become clear as we proceed that the roles played by the
matching technology in (13a) and the sharing rule in (13b) are basically
the same as those played, respectively, by the linear homogeneous
industry production function and the perfectly competitive
price-setting mechanism in conventional general equilibrium analyses.

The general class of matching processes considered below is also 
assumed to satisfy the following assumption. 

Assumption 3. The matching technology in each sector, $m^i(u_i, v_i)$,
$i = 1, 2$, is twice continuously differentiable and has well-behaved
(smooth) isoquants; the derivatives $m_u^i$ and $m_v^i$ are positive whenever $u_i$
and $v_i$ are positive; and the marginal rate of substitution in matching
between unemployed workers and vacant firms, $m_v^i/m_u^i$, is a strictly
increasing function of the unemployment-vacancy ratio.

At certain points, however, the reader may wish to compare the
allocation of resources that results under assumption 3 with the allocation
that would result when an auctioneer adjusts the sharing rule
to clear each factor market. The matching technology and sharing
rule that capture the latter situation are described below in (14).

Definition. The Walrasian matching technology is given by

$$m(u, v) = \min(u, v), \tag{14a}$$

where $m_u(x, x) = 0$.

In equilibrium, the Walrasian auctioneer will assign the entire joint
surplus to the scarce factor in the market. To confirm that this
well-known sharing rule is efficient if and only if the matching technology
is Walrasian, combine (13b) and (14a) to derive

$$\beta = \frac{um_u}{m} = \begin{cases} 1 & u < v \\ 0 & u \ge v. \end{cases} \tag{14b}$$

For the record, we have the following lemma.

Lemma 2. When the matching process in each sector is Walrasian, as
noted in (14), the economy’s production possibility frontier is linear,
with slope $-1$; only one type of agent (the one in aggregate excess
supply) experiences unemployment; all unemployment is voluntary
(e.g., if $n > k$, $\beta^i = 0$ so that $W_u^i = W_e^i = 0$); and the distribution of
unemployed agents between producing sectors is inconsequential.¹³

These features are noteworthy mainly because they are absent below.

12 As employed workers and filled firms are immobile, however, a two-sector equilibrium
cannot place any direct restriction on the relative magnitudes of permanent
incomes $rW_e^1$ and $rW_e^2$, or $rW_f^1$ and $rW_f^2$, or even the corresponding wages $w_1$ and $w_2$.


V. Commodity Prices, Permanent Incomes, 
and Output 

The relationship between commodity prices and unattached agents’
permanent incomes, under assumptions 1 and 2, is the workhorse of
the model and is derived in the Appendix, Section A. (This relationship
and all functions described in this paper are evaluated at the
same initial steady-state equilibrium.)

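Proposition 1. In each sector, the permanent incomes of unemployed
workers and vacant firms, the discounted stream of output,
and the price of output, $\{rW_u^i, rW_v^i, Q_i, p_i\}$, satisfy

$$(n_i - \rho e_i)(rW_u^i) + (k_i - \rho e_i)(rW_v^i) = p_i(rQ_i - \rho e_i) + (A_1^i + B_1^i)\dot Q_1 + (A_2^i + B_2^i)\dot Q_2, \tag{15a}$$

where

$$\rho \equiv \frac{r}{r + b}, \qquad B_j^i \equiv \frac{m^i}{r + b}\,\frac{\partial p_i}{\partial e_j}, \tag{15b}$$

and $A_j^i$ is the corresponding surplus-appreciation coefficient, a positive
multiple of $\partial\Sigma^i/\partial e_j$.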


13 With a Walrasian matching technology, the term $m_u + m_v$ is constant and independent
of $u$ and $v$. Therefore, from (8c), the joint social product of an extra employed
worker-firm pair is also constant and equal across sectors, i.e., $\partial Q_1/\partial e_1 = \partial Q_2/\partial e_2$; hence
$\partial Q_1/\partial n_1 = \partial Q_2/\partial n_2$ and $\partial Q_1/\partial k_1 = \partial Q_2/\partial k_2$. Given prices and the sharing rules in (14b),
the match surplus in sector $i$, $\Sigma^i$ ($i = 1, 2$), will be constant and independent of the level
of employment, $e_i$; furthermore, redistributing unemployed agents between sectors
leaves unchanged the rate of matching $m^i$, the employment level $e_i$, and all permanent
incomes.
As in standard neoclassical analyses, (15a) requires that (i) unit costs
must equal market prices in a competitive equilibrium (except that
$rQ_i - \rho e_i$ is the relevant output measure, while $n_i - \rho e_i$ and $k_i - \rho e_i$ are
the relevant labor and capital inputs), and (ii) each (unattached) factor
must be paid the value of its social marginal product. That is, since the
matching technology in (13a) exhibits constant returns with respect to
$u_i = n_i - e_i$ and $v_i = k_i - e_i$, $rQ_i$ in (8a) must exhibit constant returns
with respect to $e_i$, $n_i$, and $k_i$; hence

$$rQ_i = r\frac{\partial Q_i}{\partial e_i}\,e_i + \frac{\partial Q_i}{\partial e_i}\,m_u^i n_i + \frac{\partial Q_i}{\partial e_i}\,m_v^i k_i$$

and, with (8c), this gives

$$rQ_i - \rho e_i = \frac{\partial Q_i}{\partial e_i}\,m_u^i(n_i - \rho e_i) + \frac{\partial Q_i}{\partial e_i}\,m_v^i(k_i - \rho e_i). \tag{16}$$

In effect, (16) is the steady-state “production function” underlying
(15a); that is, multiplying by $p_i$ and substituting $rW_u^i = p_i m_u^i(\partial Q_i/\partial e_i)$
and $rW_v^i = p_i m_v^i(\partial Q_i/\partial e_i)$ into (16) yields (15a) evaluated at $m^i = be_i$.

The remaining terms on the right-hand side of (15a) are nonzero
only along the equilibrium adjustment path between steady states.
These terms capture the short-run impact of changes in the rates of
employment change, $\dot e_1$ and $\dot e_2$, on price and surplus appreciation in
each sector, $\dot p_i$ and $\dot\Sigma^i$ ($i = 1, 2$). The latter terms are important in the
short run because, as shown in the Appendix (Sec. A), the total asset
appreciation of attached labor and capital in $i$, $e_i(\dot W_e^i + \dot W_f^i)$, depends
separately on $\dot p_i$ and $\dot\Sigma^i$, while the total asset appreciation of unattached
labor and capital, $u_i\dot W_u^i + v_i\dot W_v^i$, depends only on $\dot\Sigma^i$.


VI. Short-Run and Steady-State Analysis 

In this section I describe the model’s short-run and steady-state
properties. The distinction is straightforward. In the short run, the
employed factors of production are immobile. The levels of employment
in each sector remain fixed ($de_i = 0$), while the rates of change of
employment may become nonzero ($d\dot e_i \ne 0$) as unattached agents are
reallocated along the convergent equilibrium path to the new
steady-state solution. Across steady states, however, the levels of employment
adjust completely while the rates of change of employment are fixed
at zero.

To illustrate, consider the output measure $rQ_i = e_i + (\partial Q_i/\partial e_i)\dot e_i$ in
(8a). Starting in a steady-state equilibrium, where $\dot e_i = 0$ and $rQ_i = e_i$
represents the initial level of output, suppose that the supplies of
labor and capital in sector $i$ change by $dn_i$ and $dk_i$, respectively. In this
case,

$$r\,dQ_i = de_i + \frac{\partial Q_i}{\partial e_i}\,d\dot e_i = \frac{\partial Q_i}{\partial e_i}\,d\dot e_i = \frac{m_u^i\,dn_i + m_v^i\,dk_i}{r + b + m_u^i + m_v^i}.$$

In the short run, additional workers (firms) are all unemployed (vacant),
so that $du_i = dn_i$ ($dv_i = dk_i$); thus if we take as given the initial
level of employment, $e_i$, and the corresponding flow into unemployment,
$be_i$, the resulting short-run change in the discounted stream of
output is due only to the consequent change in the rate of matching.

Since $m(n - e, k - e) = be$ holds in steady-state equilibrium, however,
the corresponding steady-state change in output is

$$r\,dQ_i = de_i + \frac{\partial Q_i}{\partial e_i}\,d\dot e_i = de_i = \frac{m_u^i\,dn_i + m_v^i\,dk_i}{b + m_u^i + m_v^i}.$$

If we compare the right-hand-side expressions above, it follows that
the short-run output change is smaller in absolute value than the
steady-state change whenever the discount rate is positive and that the
former approaches the latter as $r$ goes to zero. In words, short-run
adjustment paths are relatively unimportant when agents are patient.
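
The comparison between the two output responses can be made concrete with a short sketch. It assumes a Cobb-Douglas matching function for illustration only (the text leaves $m$ general), and the function name `output_responses` is hypothetical. Both responses share the numerator $m_u\,dn + m_v\,dk$ and differ only by the discount rate $r$ in the denominator.

```python
def output_responses(u, v, r, b, dn, dk, scale=1.0, alpha=0.5):
    """Short-run and steady-state changes in r*Q after factor supplies change by dn, dk."""
    m = scale * u ** alpha * v ** (1.0 - alpha)   # assumed Cobb-Douglas matching rate
    m_u = alpha * m / u
    m_v = (1.0 - alpha) * m / v
    extra_matches = m_u * dn + m_v * dk           # change in the matching rate
    short_run = extra_matches / (r + b + m_u + m_v)
    steady_state = extra_matches / (b + m_u + m_v)
    return short_run, steady_state


if __name__ == "__main__":
    for r in (0.2, 0.05, 1e-9):
        print(r, output_responses(u=0.2, v=0.3, r=r, b=0.1, dn=0.01, dk=0.0))
```

As $r$ falls toward zero in the loop, the short-run figure rises toward the steady-state figure, which does not depend on $r$.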


A. Short-Run Analysis 14 

To simplify, the following additional notation is adopted. Let
$X_i = rQ_i - \rho e_i$,

$$N_i = n_i - \rho e_i = \frac{bn_i + ru_i}{r + b}, \qquad K_i = k_i - \rho e_i = \frac{bk_i + rv_i}{r + b},$$

and let the corresponding input-output coefficients be denoted by
$a_{Ni} = N_i/X_i$ and $a_{Ki} = K_i/X_i$. Hence (12a), (12b), and (15) yield

$$a_{N1}X_1 + a_{N2}X_2 = N, \tag{17a}$$

$$a_{K1}X_1 + a_{K2}X_2 = K, \tag{17b}$$

$$a_{N1}(rW_u) + a_{K1}(rW_v) = p_1 + (A_1^1 + B_1^1)\dot Q_1 X_1^{-1} + (A_2^1 + B_2^1)\dot Q_2 X_1^{-1}, \tag{17c}$$

$$a_{N2}(rW_u) + a_{K2}(rW_v) = p_2 + (A_1^2 + B_1^2)\dot Q_1 X_2^{-1} + (A_2^2 + B_2^2)\dot Q_2 X_2^{-1}, \tag{17d}$$

where $N = n - [r(e_1 + e_2)/(r + b)]$ and $K = k - [r(e_1 + e_2)/(r + b)]$.
The terms $rW_u \equiv rW_u^i$ and $rW_v \equiv rW_v^i$ ($i = 1, 2$) represent the equilibrium
permanent incomes, in both sectors, of unemployed workers
and vacant firms.

14 This subsection describes the short-run impact of an unanticipated permanent
change in either the current price received by producers or the current aggregate
endowment of some factor. While the short-run impact of unanticipated changes,
permanent or temporary, in the future values of these variables is also of interest, the
framework required to describe these dynamic effects is beyond the scope of this paper
(see Judd 1985, 1987). Of course the steady-state impacts of both current and future
permanent changes in these variables are the same.

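With (17), the short-run changes in each sector’s discounted output
stream and in unattached agents’ incomes in response to changes in
commodity prices and factor endowments are determined in the
conventional manner. Let $\hat z \equiv dz/z$ denote the relative change in $z$. If we
start at a steady-state equilibrium with given initial employment levels
$e_1$ and $e_2$ and recognize that (16) implies

$$dX_i = r\,dQ_i = \frac{\partial Q_i}{\partial e_i}\,(m_u^i\,dN_i + m_v^i\,dK_i), \qquad i = 1, 2, \tag{18}$$

the following equations of change are derived straightforwardly from
(17):

$$\lambda_{N1}\hat X_1 + \lambda_{N2}\hat X_2 = \hat N - (\lambda_{N1}\hat a_{N1} + \lambda_{N2}\hat a_{N2}), \tag{19a}$$

$$\lambda_{K1}\hat X_1 + \lambda_{K2}\hat X_2 = \hat K - (\lambda_{K1}\hat a_{K1} + \lambda_{K2}\hat a_{K2}), \tag{19b}$$

$$\theta_{N1}(\widehat{rW_u}) + \theta_{K1}(\widehat{rW_v}) = \hat p_1 + (C_1^1 + D_1^1)\hat X_1 + (C_2^1 + D_2^1)\hat X_2 - (\theta_{N1}\hat a_{N1} + \theta_{K1}\hat a_{K1}), \tag{19c}$$

$$\theta_{N2}(\widehat{rW_u}) + \theta_{K2}(\widehat{rW_v}) = \hat p_2 + (C_1^2 + D_1^2)\hat X_1 + (C_2^2 + D_2^2)\hat X_2 - (\theta_{N2}\hat a_{N2} + \theta_{K2}\hat a_{K2}), \tag{19d}$$

where¹⁵

$$C_j^i \equiv A_j^i\,\frac{X_j}{p_i X_i}, \qquad D_j^i \equiv B_j^i\,\frac{X_j}{p_i X_i} = \frac{m^i}{r + b}\,\frac{1}{p_i}\,\frac{\partial p_i}{\partial e_j}\,\frac{X_j}{X_i}, \qquad i, j = 1, 2.$$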


The terms $\lambda_{Ni}$ and $\lambda_{Ki}$ denote the weighted-average labor-unemployment
and capital-vacancy fractions in sector $i$, respectively:

$$\lambda_{Ni} = \frac{bn_i + ru_i}{bn + r(u_1 + u_2)}, \qquad \lambda_{Ki} = \frac{bk_i + rv_i}{bk + r(v_1 + v_2)},$$

and $\theta_{Ni} = (rW_u)N_i/p_iX_i$ and $\theta_{Ki} = (rW_v)K_i/p_iX_i$ effectively represent the
value shares of labor and capital in sector $i$, respectively.

15 Notationally, $\widehat{rW_u}$ is the percentage change in $rW_u$. To get the $(C_j^i + D_j^i)\hat X_j$ terms,
observe that the change in $(A_j^i + B_j^i)(\partial Q_j/\partial e_j)(m^j - be_j)$ equals

$$(A_j^i + B_j^i)\frac{\partial Q_j}{\partial e_j}(m_u^j\,dn_j + m_v^j\,dk_j) = (A_j^i + B_j^i)\frac{\partial Q_j}{\partial e_j}(m_u^j\,dN_j + m_v^j\,dK_j) = (A_j^i + B_j^i)\,dX_j.$$

Following Jones (1965), I observe that when the matching process
in each sector is constrained efficient, as per (13), unemployed workers
and vacant firms each earn their social marginal products. This
implies, from (18), that

$$\theta_{Ni}\hat a_{Ni} + \theta_{Ki}\hat a_{Ki} = 0, \qquad i = 1, 2; \tag{20a}$$

that the relevant isoquant slope in each sector is equal to the ratio of
factors’ permanent incomes;¹⁶ and therefore that the short-run
elasticities of substitution between $K_i$ and $N_i$, and hence between vacancies
and unemployment, can be defined as

$$\sigma_i = \frac{\hat a_{Ki} - \hat a_{Ni}}{\widehat{rW_u} - \widehat{rW_v}}, \qquad i = 1, 2. \tag{20b}$$

Thus, solving (20) for the $\hat a_{ij}$’s and substituting into (19), I obtain the
following proposition.

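Proposition 2. Starting in a steady-state equilibrium, the short-run
changes in the discounted stream of output (and hence employment)
in each sector and in unemployed workers’ and vacant firms’ permanent
incomes, in response to a current change in a price or factor
endowment, satisfy

$$\lambda_{N1}\hat X_1 + \lambda_{N2}\hat X_2 = \hat N + \delta_N(\widehat{rW_u} - \widehat{rW_v}), \tag{21a}$$

$$\lambda_{K1}\hat X_1 + \lambda_{K2}\hat X_2 = \hat K - \delta_K(\widehat{rW_u} - \widehat{rW_v}), \tag{21b}$$

$$\theta_{N1}(\widehat{rW_u}) + \theta_{K1}(\widehat{rW_v}) = \hat p_1 + (C_1^1 + D_1^1)\hat X_1 + (C_2^1 + D_2^1)\hat X_2, \tag{21c}$$

$$\theta_{N2}(\widehat{rW_u}) + \theta_{K2}(\widehat{rW_v}) = \hat p_2 + (C_1^2 + D_1^2)\hat X_1 + (C_2^2 + D_2^2)\hat X_2, \tag{21d}$$

where $\delta_j = \lambda_{j1}\theta_{i1}\sigma_1 + \lambda_{j2}\theta_{i2}\sigma_2$ ($i, j = N, K$, $i \ne j$). In a small open
economy, $C_2^1 = C_1^2 = D_j^i = 0$, $i, j = 1, 2$, and the $\hat p_i$ are given exogenously
since prices are determined in world markets, and,¹⁷ since
$\partial\Sigma^i/\partial e_i < 0$, $C_i^i < 0$. In a closed economy, $\hat p_1 = \hat p_2$ since employment and hence
market supply are fixed in the short run.

16 That is,

$$\frac{\partial X_i/\partial N_i}{\partial X_i/\partial K_i} = \frac{m_u^i(\partial Q_i/\partial e_i)}{m_v^i(\partial Q_i/\partial e_i)} = \frac{(\beta^i m^i/u_i)\Sigma^i}{[(1 - \beta^i)m^i/v_i]\Sigma^i} = \frac{rW_u}{rW_v}.$$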

Observe that industries are ranked in the short run in terms of the
numbers of employed and unemployed workers and firms located in
each sector. Specifically, sector $i$ is said to be “labor-unemployment
intensive” relative to $j$, and sector $j$ is correspondingly said to be
“capital-vacancy intensive” relative to $i$, whenever

$$\frac{bn_i + ru_i}{bk_i + rv_i} > \frac{bn_j + ru_j}{bk_j + rv_j}.$$

Letting $|\lambda| = \lambda_{N1}\lambda_{K2} - \lambda_{K1}\lambda_{N2}$ and $|\theta| = \theta_{N1}\theta_{K2} - \theta_{N2}\theta_{K1}$ denote the
determinants of the matrices of coefficients in (21), observe that sign
$|\lambda|$ is positive (negative) whenever sector 1 (2) is labor-unemployment
intensive and sign $|\theta|$ = sign $|\lambda|$.

B. Steady-State Analysis 

The steady-state properties of the basic 2 x 2 matching model are
determined by (17) and $be_i = m^i(u_i, v_i)$, $i = 1, 2$. Let $\tilde a_{Ni} = n_i/e_i$ and
$\tilde a_{Ki} = k_i/e_i$ denote the steady-state input-output ratios in $i$, so that

$$\tilde a_{N1}e_1 + \tilde a_{N2}e_2 = n, \tag{17a'}$$

$$\tilde a_{K1}e_1 + \tilde a_{K2}e_2 = k, \tag{17b'}$$

$$a_{N1}(rW_u) + a_{K1}(rW_v) = p_1, \tag{17c'}$$

$$a_{N2}(rW_u) + a_{K2}(rW_v) = p_2. \tag{17d'}$$

To derive the steady-state counterparts to (21) we need some additional
notation and definitions.


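17 Drop sector-specific notation. To show that $\partial\Sigma/\partial e < 0$, substitute (7a), (7b), and (13)
into (6) to get

$$\Sigma = \frac{\partial Q}{\partial e}\left[p + \frac{\partial\Sigma}{\partial e}\,(m - be)\right],$$

where $\partial Q/\partial e = 1/(r + b + m_u + m_v)$ and $\partial\Sigma/\partial e = \partial(W_e + W_f - W_u - W_v)/\partial e$. Let
$F(e) \equiv \partial Q/\partial e$ and, with $p$ fixed, take the derivative of the equation above at $m = be$ to get

$$\frac{\partial\Sigma}{\partial e}\,[1 + F\,(b + m_u + m_v)] = pF',$$

so that sign $\partial\Sigma/\partial e$ = sign $F'(e)$, where $F' = -F^2\,\partial(m_u + m_v)/\partial e$. Under assumption 3 we
can write the matching technology as $m = vG(u/v)$, where $G' > 0$ and $G'' < 0$, so that

$$\frac{\partial(m_u + m_v)}{\partial e} = -\frac{G''\,[1 - (u/v)]^2}{v} > 0$$

and hence $F' < 0$.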
First, let $\tilde\lambda_{Ni} = n_i/n$ and $\tilde\lambda_{Ki} = k_i/k$ denote the factor proportions in
sector $i$. Second, solve $be_i = m^i(n_i - e_i, k_i - e_i)$ for the steady-state
employment levels

$$e_i = M^i(n_i, k_i), \qquad i = 1, 2, \tag{22}$$

where $M_n^i = m_u^i/(b + m_u^i + m_v^i)$ and $M_k^i = m_v^i/(b + m_u^i + m_v^i)$. Third,
define $\tilde\theta_{Ni} = n_iM_n^i/e_i$ and $\tilde\theta_{Ki} = k_iM_k^i/e_i$, where $\tilde\theta_{Ni} + \tilde\theta_{Ki} = 1$. Finally,
since $M_n^i/M_k^i = m_u^i/m_v^i = rW_u/rW_v$, define the steady-state elasticity of
substitution between labor and capital as

$$\tilde\sigma_i = \frac{\hat{\tilde a}_{Ki} - \hat{\tilde a}_{Ni}}{\widehat{rW_u} - \widehat{rW_v}}, \qquad i = 1, 2.$$

Our main result then follows (see the Appendix, Sec. B). 

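Proposition 3. The steady-state changes in employment in each
sector and in unemployed workers’ and vacant firms’ permanent incomes
satisfy

$$\tilde\lambda_{N1}\hat e_1 + \tilde\lambda_{N2}\hat e_2 = \hat n + \tilde\delta_N(\widehat{rW_u} - \widehat{rW_v}), \tag{21a'}$$

$$\tilde\lambda_{K1}\hat e_1 + \tilde\lambda_{K2}\hat e_2 = \hat k - \tilde\delta_K(\widehat{rW_u} - \widehat{rW_v}), \tag{21b'}$$

$$\theta_{N1}(\widehat{rW_u}) + \theta_{K1}(\widehat{rW_v}) = \hat p_1, \tag{21c'}$$

$$\theta_{N2}(\widehat{rW_u}) + \theta_{K2}(\widehat{rW_v}) = \hat p_2, \tag{21d'}$$

where $\tilde\delta_j = \tilde\lambda_{j1}\tilde\theta_{i1}\tilde\sigma_1 + \tilde\lambda_{j2}\tilde\theta_{i2}\tilde\sigma_2$ ($i, j = N, K$, $i \ne j$). In a small open
economy, the $\hat p_i$ are given exogenously; in a closed economy,

$$\hat e_1 - \hat e_2 = -\sigma_D(\hat p_1 - \hat p_2), \tag{23}$$

where $\sigma_D$ is the aggregate elasticity of substitution in demand. Notice
that the determinant $|\tilde\lambda| = \tilde\lambda_{N1}\tilde\lambda_{K2} - \tilde\lambda_{K1}\tilde\lambda_{N2}$ has

$$\operatorname{sign}|\tilde\lambda| = \operatorname{sign}\left(\frac{n_1}{k_1} - \frac{n_2}{k_2}\right).$$

Therefore, the steady-state factor intensity ranking of industries can
differ depending on whether that ranking is based on their
labor-capital ratios and hence on the sign of $|\tilde\lambda|$ or their labor-unemployment
to capital-vacancy ratios and hence on the sign of $|\theta|$.¹⁸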
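
To see how the two rankings can disagree, the sketch below compares the labor-capital ratio $n_i/k_i$ with the labor-unemployment to capital-vacancy ratio $(bn_i + ru_i)/(bk_i + rv_i)$ across sectors. The sector data are hypothetical numbers chosen only for illustration; they are not solved from any particular pair of matching technologies, and the function names are assumptions of this example.

```python
def labor_capital_gap(n1, k1, n2, k2):
    """n1/k1 - n2/k2: positive when sector 1 has the higher labor-capital ratio."""
    return n1 / k1 - n2 / k2


def labor_unemployment_gap(n1, k1, e1, n2, k2, e2, r, b):
    """Gap in (b*n + r*u)/(b*k + r*v) between sectors 1 and 2, with u = n - e, v = k - e."""
    u1, v1 = n1 - e1, k1 - e1
    u2, v2 = n2 - e2, k2 - e2
    return (b * n1 + r * u1) / (b * k1 + r * v1) - (b * n2 + r * u2) / (b * k2 + r * v2)


if __name__ == "__main__":
    # Hypothetical sectors: sector 1 has little employment, sector 2 is nearly fully matched.
    sectors = dict(n1=12.0, k1=10.0, e1=2.0, n2=11.0, k2=10.0, e2=9.5)
    print("labor-capital gap:", labor_capital_gap(12.0, 10.0, 11.0, 10.0))
    for r in (0.0, 1.0):
        print("r =", r, "gap:", labor_unemployment_gap(r=r, b=0.1, **sectors))
```

In this example sector 1 has the higher labor-capital ratio, yet with a positive discount rate sector 2 is the labor-unemployment intensive one because its few unattached agents are mostly workers. At $r = 0$ the second measure collapses to the labor-capital ratio and the two rankings agree.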


18 When these rankings differ, $|\theta||\tilde\lambda| < 0$ and (21') imply, e.g., that an increase in the
relative price of a commodity can lead to a decrease in its steady-state output and that
the imposition of a price-raising tariff to protect a labor-intensive industry can make
unemployed workers worse off. These paradoxical results are due primarily to the fact
that agents discount the future and that turnover and frictions that make search an
individually time-consuming activity are both permanent features of the economy.
Similar results occur in standard full-employment models with factor market distortions
(e.g., a wage differential between sectors) when the physical and value factor
intensity rankings of sectors differ (see Bhagwati and Srinivasan 1971; Jones 1971;
C. A Special Case

I conclude this section by confirming that the short-run and steady- 
state analyses developed above remain distinct only when agents are 
impatient and are less concerned with the long-term consequences of 
contemporaneous disturbances. 

Proposition 4. In the limit as the discount rate $r$ goes to zero, the
short-run and steady-state changes satisfying (21) and (21'), respectively,
coincide.¹⁹

VII. Applications to the Theory of 
International Trade 

On inspection, equations (21) and especially (21') are notationally
very similar to the equations of change that characterize the perfectly
competitive 2 x 2 neoclassical model of production, specifically,
equations (1b)-(4b) in Jones (1965). Thus readers familiar with
applications of the latter framework to international trade (Jones and

Magee 1971). In this setting, Neary (1978a) argues that situations in which these factor
intensity measures differ can be ruled out as dynamically unstable. If we proceed
likewise here, this amounts to substituting

$$\dot n_1 = F(rW_u^1 - rW_u^2), \qquad \dot k_1 = G(rW_v^1 - rW_v^2), \qquad F', G' > 0, \tag{24}$$

for (12b) in the steady-state model; it is easy to verify that (24) is (locally) stable only if
$|\theta||\tilde\lambda| > 0$. Introducing (24) in this manner makes economic sense, however, only if the
movement of unattached labor and capital between sectors is considerably more sluggish
than the matching of unemployed workers and vacant firms within sectors. In fact,
of course, the exact opposite is true here. As a result, it only makes sense to add (24) to
the short-run model in which the corresponding stability condition, $|\theta||\lambda| > 0$, always
holds.

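19 From their definitions, it is immediate that $\{X_i, N_j, K_j, a_{ij}, \lambda_{ij}, \sigma_i, \delta_j\}$ approach
$\{e_i, n_j, k_j, \tilde a_{ij}, \tilde\lambda_{ij}, \tilde\sigma_i, \tilde\delta_j\}$ and that $\{\rho, C_i^i\}$ approach $\{0, 0\}$ as $r \to 0$, which establishes the result for
small open economies. For closed economies, we have

$$\lim_{r \to 0} D_j^i = \frac{m^i}{b}\,\frac{1}{p_i}\,\frac{\partial p_i}{\partial e_j}\,\frac{e_j}{e_i} = \frac{e_j}{p_i}\,\frac{\partial p_i}{\partial e_j},$$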

so that the right-hand sides of (21c) and (2 Id) become 

j + lip-e, + e J.p.i s = 

p\ dfi p\ df 2 


p + L 

Pi pi fei 




where $\hat p$ is the common short-run percentage change in the prices of goods 1 and 2.
Therefore, the short-run and steady-state closed-economy responses coincide in the
limit if $\hat e_1$ and $\hat e_2$ satisfy (23), i.e., if $\hat e_1 - \hat e_2 = -\sigma_D(\hat p_1 - \hat p_2)$; to confirm the latter result,
differentiate $e_1/e_2 = f(p_1/p_2)$ to get


_i_. Ait 
<j d p, a«, 




1,2 0 ^ 1 ). 
Neary 1984) and public finance (Atkinson and Stiglitz 1980, chaps. 6,
7) will expect correspondingly modified results to hold here as well.

This section identifies the distinct implications of factor market
search for some illustrative trade issues.²⁰ Specifically, we compare the
short-run and steady-state consequences, for unemployed workers
and vacant firms, of two developments. In one situation, the relative
price of a commodity rises because of, say, the imposition of a tariff to
protect an import competing industry; in the other situation, factor
endowments change because of either increased immigration or direct
foreign investment.

Throughout, consideration will be limited to a small price-taking
economy, satisfying $C_2^1 = C_1^2 = D_j^i = 0$ and $C_i^i < 0$, that is incompletely
specialized in the production of either good.


A. Price Changes 

Equations (21c') and (21d') yield the following Stolper-Samuelson-like,
steady-state result: An increase in the relative price of the good
produced in the sector whose matching process is labor-unemployment
(capital-vacancy) intensive raises the real steady-state permanent
income of unemployed workers (vacant firms) and lowers the
real permanent income of vacant firms (unemployed workers). The
idea is that the expanding industry absorbs relatively more of the
unattached factor that the contracting sector’s matching process uses
intensively, and this increases the probability that the expanding
sector’s intensively used unattached factor will find a trading partner.
This result is the search counterpart to Jones’s (1965) well-known
“magnification effect” of price changes; that is, if $\hat p_1 > \hat p_2$ and sector 1
is labor-unemployment intensive, then $|\theta| > 0$ and
$\widehat{rW_u} > \hat p_1 > \hat p_2 > \widehat{rW_v}$. Of course, the interesting conflicts of interest with respect to
pricing (tariff) policy are not simply between unattached workers and
vacant firms, as suggested by the proposition, but will also include
possible alliances with employed workers or filled firms.

Compared with this steady-state response, employment levels ad¬ 
just only partially and slowly in the short run as unattached agents 
move between sectors in response to the change in the relative surplus 

20 An immediate implication of factor market search for open economies is that the
capital and labor services embodied in net exports will generally include both the services
of employed capital and labor and the services of vacant jobs and unemployed
workers. That is, when factor markets entail frictions and infrastructure is allocationally
important, the factor services that are built into tradable commodities include both
those that go directly into the production of output and those that are involved in
bringing the factors of production together to form producing units. By contrast,
conventional analyses impose frictionless markets and hence preclude any positive
allocative role for unemployed resources.
created by a match in each sector (as a consequence of the initial
relative price change). This gradual adjustment of employment levels
introduces an asset appreciation effect that differentiates the
short-run response and that must be taken into account when agents
discount the future.

For example, an increase in the relative price of commodity 1
($\hat p_1 > \hat p_2$) increases the relative size of the joint surplus associated with a
match in sector 1. Since this raises the return to participating in that
sector’s matching process, unattached agents begin to leave sector 2
for sector 1. This causes the rate of matching to rise in sector 1 and to
fall in sector 2; that is, $\dot e_1, \hat X_1 > 0$ and $\dot e_2, \hat X_2 < 0$ (see n. 21). Recall that
sign$(\dot\Sigma^i)$ = sign$(C_i^i\hat X_i)$, where $C_i^i < 0$. Thus, starting in a steady-state
equilibrium, the joint surplus depreciation in the expanding sector
partially offsets its relative price gain, and the corresponding surplus
appreciation in the contracting sector partially makes up for its relative
price loss. Therefore, the real short-run effects of a price change
may be uncertain; that is, $\hat p_1 > (<)\ \hat p_2$ yields $\hat p_1 + C_1^1\hat X_1 < (>)\ \hat p_1$ and
$\hat p_2 + C_2^2\hat X_2 > (<)\ \hat p_2$.

It can therefore be shown that an increase in the relative price of 
the labor-unemployment intensive commodity increases the perma¬ 
nent income of unemployed workers relative to vacant firms in the 
short run. Although unemployed workers (vacant firms) gain (lose) in 
terms of the capital-vacancy (labor-unemployment) intensive good, 
they may also lose (gain) in terms of the other commodity when, in 
particular, all agents are impatient. 21 That is, if $\hat{p}_1 > \hat{p}_2$ and sector 1 is
labor-unemployment intensive, the short run may be characterized by
an “attenuation effect,” whereby $\hat{p}_1 > r\hat{W}_u > r\hat{W}_v > \hat{p}_2$. 22


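21 To establish these results, set $\hat{N} = \hat{K} = 0$ and solve (21a) and (21b) for the output
changes $\hat{X}_i = (-1)^{i-1}[E_i(r\hat{W}_u - r\hat{W}_v)/|\lambda|]$, where
$E_i = \lambda_{Kj}\delta_N + \lambda_{Nj}\delta_K > 0$, $i, j = 1, 2$, $i \neq j$.
In turn, substituting these expressions into (21c) and (21d) and solving for $r\hat{W}_u$ and
$r\hat{W}_v$, we get

$$r\hat{W}_u - r\hat{W}_v = \frac{\hat{p}_1 - \hat{p}_2}{|\tilde{\theta}|},$$

$$r\hat{W}_u - \hat{p}_1 = \left(\theta_{K1} + \frac{C_1^1 E_1}{|\lambda|}\right)\frac{\hat{p}_1 - \hat{p}_2}{|\tilde{\theta}|},$$

$$r\hat{W}_u - \hat{p}_2 = \left(\theta_{K2} - \frac{C_2^2 E_2}{|\lambda|}\right)\frac{\hat{p}_1 - \hat{p}_2}{|\tilde{\theta}|},$$

where $|\tilde{\theta}| = |\theta| - [(C_1^1 E_1 + C_2^2 E_2)/|\lambda|]$. Since $\mathrm{sign}|\theta| = \mathrm{sign}|\lambda|$ and $C_i^i < 0$,
$\mathrm{sign}|\tilde{\theta}| = \mathrm{sign}|\theta|$. Therefore, $\hat{p}_1 > \hat{p}_2$ implies $\hat{X}_1 > 0$ and $\hat{X}_2 < 0$. Furthermore, if sector 1 is
labor-unemployment intensive, so that $|\theta| > 0$ and $|\lambda| > 0$, then $\hat{p}_1 > \hat{p}_2$ implies $r\hat{W}_u > r\hat{W}_v$
and $r\hat{W}_u > \hat{p}_2$ but need not imply $r\hat{W}_u > \hat{p}_1$ as $C_1^1 E_1/|\lambda| < 0$.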

22 If we interchange labor and capital, unemployment and vacancy, and worker and
firm, this proposition also describes the short-run distributional effects of an increase in
the relative price of the capital-vacancy intensive commodity.

Since $C_i^i$ approaches zero as $r$ goes to zero, unemployed workers
are more likely to gain in terms of both commodities ($r\hat{W}_u > \hat{p}_1 > \hat{p}_2$) as
agents become increasingly patient and as steady-state considerations
become more important. Intuitively, farsighted agents are more
concerned with where the economy is going than with how it gets there.
As a result, it becomes less important whether the joint surplus
associated with a match in sector $i$ is initially appreciating rapidly or slowly.
Thus, as the $C_i^i\hat{X}_i$ terms in (21c) and (21d) go to zero, the
magnification effects derived earlier from (21c') and (21d') emerge.


B. Endowment Changes 

If we hold world relative commodity prices fixed, an increase in the 
endowment of one factor causes a more than proportionate increase 
in employment in the sector whose matching process uses that factor 
relatively intensively (as determined by its labor-capital ratio) and an 
absolute decline in employment elsewhere. The basic idea is that a 
fixed commodity-price ratio implies a fixed steady-state ratio of
unemployed workers’ to vacant firms’ permanent incomes, from (21c')
and (21d'), which implies a fixed unemployment-vacancy ratio in each
sector. Thus to maintain these ratios and hence maintain the proba¬ 
bilities of finding a trading partner in each sector, any increase in, say, 
the capital stock must be absorbed entirely by the expanding capital- 
intensive matching process, which also absorbs some capital and labor 
(though mostly labor) from the contracting labor-intensive process. 
From equations (21a') and (21b'), for example, if $\hat{n} > \hat{k}$ and sector 1 is
labor intensive, then $|\lambda| > 0$ and $\hat{e}_1 > \hat{n} > \hat{k} > \hat{e}_2$.
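
A small numerical sketch may help fix the ordering just stated. It assumes that, with relative permanent incomes held fixed, the steady-state resource constraints reduce to $\lambda_{N1}\hat{e}_1 + \lambda_{N2}\hat{e}_2 = \hat{n}$ and $\lambda_{K1}\hat{e}_1 + \lambda_{K2}\hat{e}_2 = \hat{k}$, with allocation shares that sum to one across sectors. The share values and the function name are hypothetical.

```python
def employment_changes(lam_n1, lam_k1, n_hat, k_hat):
    """Solve lam_Ni * e_i summed over i = n_hat and lam_Ki * e_i summed = k_hat.

    lam_n1 and lam_k1 are hypothetical fractions of labor and capital
    allocated to sector 1; sector 2 is assumed to receive the remainder.
    """
    lam_n2 = 1.0 - lam_n1
    lam_k2 = 1.0 - lam_k1
    det = lam_n1 * lam_k2 - lam_n2 * lam_k1  # |lambda|
    if det == 0.0:
        raise ValueError("sectors have identical factor intensities")
    e1 = (n_hat * lam_k2 - k_hat * lam_n2) / det
    e2 = (lam_n1 * k_hat - lam_k1 * n_hat) / det
    return e1, e2


# Sector 1 is labor intensive (0.6 > 0.3); labor grows by 2 percent, capital is unchanged.
e1, e2 = employment_changes(0.6, 0.3, n_hat=0.02, k_hat=0.0)
print(e1, e2)
```

In this example employment in the labor-intensive sector rises more than proportionately, and employment in the other sector falls absolutely.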

Since the steady-state permanent incomes of unattached labor and 
capital do not respond to endowment changes, it follows, therefore, 
that the objections raised either by unemployed workers to more 
open immigration policies or by the owners of idle plants to direct 
foreign investment can be understood here only as part of a short-run 
analysis in which agents discount the future. The reason is that unem¬ 
ployed workers’ and vacant firms’ permanent incomes respond to 
factor endowment changes in the short run, even when commodity 
prices are fixed. As demonstrated earlier, the reason is that the conse¬ 
quent change in the rate of matching in each sector changes the rate 
of appreciation of agents’ asset values. 

In particular, $\hat{p}_i = 0$ ($i = 1, 2$) together with (21c) and (21d) imply
that

$$r\hat{W}_u - r\hat{W}_v = \frac{C_1^1\hat{X}_1 - C_2^2\hat{X}_2}{|\theta|}.$$

This says that unemployed workers gain relative to vacant firms
when, as a result of the relative change in the rates of matching in
both sectors, the relative surplus created by a match in the labor- 
unemployment intensive sector rises. Combining this with (21a) and 
(21b) yields the following results: An increase in the endowment of 
labor (capital) initially enters the economy in the form of unemployed 
workers (vacant firms). In the short run, this increases the discounted 
stream of output produced in the labor-unemployment (capital- 
vacancy) intensive sector and decreases the permanent incomes of 
unemployed workers (vacant firms). If agents are not too impatient, 
the discounted stream of output also tends to fall in the capital- 
vacancy (labor-unemployment) sector and the permanent incomes of 
vacant firms (unemployed workers) tend to rise. 23 

The complicating short-run element is again easy to see. In the 
event that an increase in the supply of labor draws resources out of 
the capital-vacancy intensive sector and into the labor-unemployment 
intensive sector, it will cause the rate of appreciation of agents’ asset 
values to fall in the former expanding sector and to rise in the latter 
contracting sector. This in turn will encourage unattached agents to 
reverse direction, moving back toward, or staying in, the capital- 
vacancy intensive sector. 

VIII. Incidence of the Corporation Tax 

This section uses the closed-economy version of the model to describe 
the short-run and steady-state effects of an unanticipated tax on the 
income of employed capital in sector 1 (the “corporate” sector) on the 
relative discounted income streams of unemployed capital and labor. 


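23 Substituting the expression in the text into (21a) and (21b) and setting $\hat{K} = 0$ yields

$$\hat{X}_i = (-1)^{i-1}\left[\lambda_{Kj} + (-1)^i\frac{C_j^j\delta_K}{|\theta|}\right]\frac{\hat{N}}{|\tilde{\lambda}|}, \qquad i \neq j,$$

where $\mathrm{sign}|\tilde{\lambda}| = \mathrm{sign}|\lambda|$ as

$$|\tilde{\lambda}| = \begin{vmatrix} \lambda_{N1} - C_1^1\delta_N/|\theta| & \lambda_{N2} + C_2^2\delta_N/|\theta| \\ \lambda_{K1} + C_1^1\delta_K/|\theta| & \lambda_{K2} - C_2^2\delta_K/|\theta| \end{vmatrix}
= |\lambda| - \frac{C_1^1\delta_N\lambda_{K2} + C_2^2\delta_K\lambda_{N1} + C_1^1\delta_K\lambda_{N2} + C_2^2\lambda_{K1}\delta_N}{|\theta|}
= \frac{|\lambda||\theta| + H}{|\theta|},$$

where $H > 0$ as $C_i^i < 0$. To establish the proposition, suppose that sector 1 is
labor-unemployment intensive, so that $|\theta| > 0$ and $|\lambda| > 0$. In this case, given $\hat{N} > 0$, the
equations above yield $\hat{X}_1 > 0$ and $C_1^1\hat{X}_1 < C_2^2\hat{X}_2$, and hence (21c) and (21d) yield
$r\hat{W}_v > C_2^2\hat{X}_2 > C_1^1\hat{X}_1 > r\hat{W}_u$. While the sign of $\hat{X}_2$ is generally ambiguous, it becomes negative
as $C_i^i$ goes to zero, and this occurs when, in particular, the discount rate approaches
zero.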




As this exercise is only an illustrative application rather than an
exhaustive treatment of tax issues, the standard simplifying assumptions
are adopted: (i) the corporation tax is introduced at an infinitesimal 
level, (ii) there are no other taxes, and (iii) all proceeds are given to 
consumers as a lump-sum subsidy. 

In sector 1, employed workers are paid $w_1$ and filled firms gross $p_1 - w_1$;
given tax rate $t/T$, where $T = 1 + t$, firms net $(p_1 - w_1)/T$ while
the government receives $t(p_1 - w_1)/T$. Therefore, the expected
lifetime income of a filled firm in sector 1 satisfies

$$rW_f^1 = \frac{p_1 - w_1}{T} - b(W_f^1 - W_v^1) + \dot{W}_f^1.$$ (4c')

The asset values of the remaining private agents in sectors 1 and 2 are
again described by (4). On the public side, let $G_f$ and $G_v$ denote,
respectively, the expected discounted lifetime tax revenue associated
with filled and vacant firms in sector 1, so that

$$rG_f = \frac{t(p_1 - w_1)}{T} - b(G_f - G_v) + \dot{G}_f,$$ (25a)

$$rG_v = \frac{m^1}{v_1}(G_f - G_v) + \dot{G}_v.$$ (25b)


Therefore, $W_f^1 + G_f$ and $W_v^1 + G_v$ represent the expected discounted
joint income, for a firm and the government, associated with filled and
vacant firms, respectively. Moreover, from (4), (4c'), and (25),

$$W_i^1 = \frac{W_i^1 + G_i}{T}, \qquad G_i = \frac{t(W_i^1 + G_i)}{T}, \qquad i = f, v.$$

Observe that the joint surplus created by a match in sector 1 is
$\Sigma^1 = W_e^1 - W_u^1 + W_f^1 - W_v^1 + G_f - G_v$. Now, given assumptions 1 and 2, the
short-run and steady-state closed-economy analyses go through
exactly as before, except that the joint firm-government asset values,
$W_f^1 + G_f = TW_f^1$ and $W_v^1 + G_v = TW_v^1$, replace $W_f^1$ and $W_v^1$.

In the short run, employment and market supply are fixed in each
sector, and hence the commodity price ratio is constant. Thus, given
the tax modifications to (21) detailed in the Appendix, Section C, we
have

$$r\hat{W}_u - r\hat{W}_v = |\theta|^{-1}[(J_1\hat{X}_1 - J_2\hat{X}_2) - \theta_{K1}\hat{T}],$$

where $J_i = C_i^i + D_i^i - C_i^j - D_i^j$ ($j \neq i$). The immediate effect of the tax
is to decrease the wealth of employed capital in sector 1 and therefore
to decrease the asset value of vacant firms in that sector. This induces
vacant firms to move from sector 1 to sector 2, which causes the
unemployment-vacancy ratio to rise in sector 1 and to fall in sector 2.
As a result, the probability that an unemployed worker will find a job
in sector 1 (2) falls (rises), and given the consequent effect on his asset
value, this induces unemployed workers also to move from sector 1 to
sector 2. Therefore, the initial effect of the tax is to cause sector 1 to
contract; this effect is captured by the $-\theta_{K1}\hat{T}$ term and is also present
in steady-state analyses (see below). Whether this contraction alone
causes unemployed workers to gain or lose depends on sector 1’s
factor intensity.

As unattached agents begin to leave sector 1 for sector 2, however,
the rate of matching tends to fall in 1 and rise in 2; therefore, output
will be falling in 1 and rising in 2. This causes the output price and the
joint surplus created by a match both to appreciate in the contracting
sector and to depreciate in the expanding sector. The own-sector
price-surplus effects are captured by the $(C_i^i + D_i^i)\hat{X}_i$ terms while the
cross-sector price-surplus effects are captured by the $(C_j^i + D_j^i)\hat{X}_j$
terms. Whether sector 1 expands or contracts on net depends on the
sign of $(J_1\hat{X}_1 - J_2\hat{X}_2) - \theta_{K1}\hat{T}$, which is ambiguous.

The steady-state case is also straightforward. From the
tax-modified version of (21c') and (21d') in the Appendix, Section C, we
have

$$r\hat{W}_u - r\hat{W}_v = |\theta|^{-1}[(\hat{p}_1 - \hat{p}_2) - \theta_{K1}\hat{T}].$$

The short-run reallocation of resources to sector 2 causes sector 1’s
relative price and joint surplus to appreciate because of the relative
decrease in the rate of matching in sector 1. Across steady states,
however, surplus and asset appreciation are absent while the
cumulative counterpart to price appreciation is a price level change; the
contraction of 1 causes the relative supply of commodity 1 to fall and
hence causes its relative price to rise. This accounts for the offset,
$\hat{p}_1 - \hat{p}_2$, to the initial contractionary effect of the tax, $-\theta_{K1}\hat{T}$, in the
expression above. Alternatively, note that $J_1\hat{X}_1 - J_2\hat{X}_2$ approaches
$\hat{p}_1 - \hat{p}_2$ in the limit as $r \to 0$ (see n. 19).

Solving the remaining steady-state equations in the Appendix for
$\hat{p}_1 - \hat{p}_2$ yields 24

$$(r\hat{W}_u - r\hat{W}_v)R = (S\sigma_1 - \theta_{K1}\sigma_d|\lambda|)\hat{T},$$

where $S > 0$ and $R$ is positive when the economy’s supply curve
intersects its demand curve from above (Atkinson and Stiglitz 1980).
In these circumstances, we have the following results: In response to
the introduction of an infinitesimal tax on the income of employed
capital in the corporate sector, the steady-state asset value of vacant
capital will fall relative to unemployed labor whenever the matching
process in the corporate sector is capital intensive ($|\lambda| < 0$).
Alternatively, the asset value of vacant capital will rise relative to unemployed
labor when the matching process in the corporate sector is labor
intensive ($|\lambda| > 0$) and Walrasian ($\sigma_1 = 0$). Recall that the Walrasian
matching process in (14) has a fixed coefficients technology and hence
has a zero elasticity of substitution.

24 $R = \sigma_d|\theta||\lambda| + \delta_N + \delta_K$ and $S = \lambda_{N1}\theta_{K1} + \lambda_{K1}\theta_{N1}$.

Finally, because workers’ and firms’ average lifetime incomes, in the
absence of discounting, do not depend on their employment status at
a point in time, 25 $rW_e^i = rW_u^i$, $rW_f^i = rW_v^i$, and

$$(r\hat{W}_e - r\hat{W}_f)R = (S\sigma_1 - \theta_{K1}\sigma_d|\lambda|)\hat{T}$$

likewise describes the effect of a corporate tax on the relative
equilibrium average incomes of employed workers and filled firms.


IX. Empirical Matching Technologies 

Applied general equilibrium analysis is concerned with mapping the 
Arrow-Debreu representation of a frictionless economy into models 
that are amenable to simulation and policy evaluation (Shoven and 
Whalley 1984). Once preferences, endowments, and production tech¬ 
nologies are specified using actual data and specific functional forms, 
this exercise is straightforward. For constrained efficient economies 
characterized by trade frictions and search, the set of primitives must 
be expanded to include sectoral matching technologies. Therefore, 
the additional empirical task in the latter situation is to secure esti¬ 
mates of these matching technologies. 26 It is interesting to note that a 
large number of economywide matching technologies have already 
been estimated, though only indirectly and inadvertently. These 
matching technologies are simply by-products of estimated hazard 
models, and in principle, the same approach can be used to estimate 
sectoral hazard functions and hence sectoral matching technologies. 

The hazard rate is the transition probability from unemployment to 
employment. The hazard rate in sector i of the model described in 
this paper is the probability that an unemployed worker finds a vacant 
firm, $m^i(u_i, v_i)/u_i$; given constant returns to scale in matching, this
hazard rate is a function only of the unemployment-vacancy ratio. 
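
The dependence on the ratio alone is easy to verify numerically. The sketch below assumes a Cobb-Douglas matching function purely for illustration; the paper does not commit to this form, and the parameter values are hypothetical.

```python
import math


def matches(u, v, scale=0.5, alpha=0.6):
    """Hypothetical Cobb-Douglas matching function, homogeneous of degree one."""
    return scale * (u ** alpha) * (v ** (1.0 - alpha))


def hazard(u, v):
    """Rate at which an unemployed worker finds a vacant firm: m(u, v) / u."""
    return matches(u, v) / u


base = hazard(100.0, 50.0)
scaled = hazard(300.0, 150.0)    # same unemployment-vacancy ratio, larger market
tighter = hazard(100.0, 80.0)    # more vacancies per unemployed worker
print(base, scaled, tighter)
```

Scaling both pools by the same factor leaves the hazard unchanged, while raising vacancies relative to unemployment raises it.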


25 For example, if we drop sector-specific notation, $W_e - W_u = \beta\Sigma$ and (7a) and (7c)
yield

$$\lim_{r \to 0} rW_u = \lim_{r \to 0} rW_e = \frac{p(m/u)\beta}{b + (m/u)\beta + (m/v)(1 - \beta)}.$$

26 Estimating the demand-side elasticity of substitution and assigning numbers to the
remaining short-run and steady-state parameters (input-output ratios, factor shares,
and factor proportions) is conceptually no more (or less) difficult than the counterpart
exercise in the neoclassical framework.

More generally, the hazard rate at time $t$ for worker $k$, given that his
current unemployment spell began at $\tau < t$, can be represented by
(see Lancaster 1979; Nickell 1979; Flinn and Heckman 1983; Burdett
et al. 1984; Topel 1984; Ham and Rea 1987) $h_k(t, \tau) = \mu_k(t, \tau)\pi_k(t, \tau)$,
where $\mu_k(\cdot)$ is the arrival rate of job offers and $\pi_k(\cdot)$ is the probability
that an offer is acceptable, that is, the probability that an offer exceeds
$k$’s reservation wage. For economies in which transitions among states
are first-order Markov, including the economy described in this
paper, the corresponding hazard rate is time independent and hence
fails to exhibit duration dependence. The remaining portions of this
section concentrate on time-independent hazard rates.

A matching technology is derived from individuals’ equilibrium
hazard rates as follows. Suppose that workers are indexed by their
search costs. With sequential search, worker $k$’s reservation wage, $\omega_k$,
is a decreasing function of his search cost, $c_k$; hence $\omega_k = \omega(c_k)$. Thus,
for a given cumulative distribution of wage offers, $F(w)$, $\pi_k = 1 - F(\omega(c_k))$
or, more compactly, $\pi_k = \pi(c_k)$. Assuming that the probability
of receiving a job offer depends on a worker’s search intensity, which
in turn depends on his search cost, we can write $\mu_k = \mu(c_k)$. Since the
functions $\omega(\cdot)$, $F(\cdot)$, and $\mu(\cdot)$ will be endogenous in the underlying
market equilibrium, being the aggregate consequences of the
participants’ search and wage-setting strategies, they will generally depend
on the numbers of participants on the supply and demand sides of the
market, $u$ and $v$. Therefore, $h_k = h(c_k, u, v)$, and so the corresponding
matching technology is given by $m(u, v) = u\int h(c, u, v)\,\Omega(c)\,dc$, when
search costs are distributed as $c \sim \Omega(c)$. 27
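
The aggregation step can be sketched numerically. Everything specific below is an assumption made for illustration and is not taken from the paper: search costs are uniform on [0, 1], wage offers are uniform on [0, 1], the reservation wage falls linearly in the search cost, and the offer arrival rate depends on market tightness and on the search cost. The integral is approximated with a midpoint rule.

```python
def offer_arrival_rate(c, u, v):
    """Hypothetical arrival rate: rises with v/u, falls with the search cost c."""
    return (v / u) ** 0.5 * (1.0 - 0.5 * c)


def acceptance_probability(c):
    """1 - F(reservation wage), with F uniform on [0, 1] (an assumption)."""
    reservation_wage = 0.9 - 0.5 * c          # decreasing in the search cost
    offer_cdf = min(max(reservation_wage, 0.0), 1.0)
    return 1.0 - offer_cdf


def individual_hazard(c, u, v):
    """Hazard = arrival rate of offers times probability an offer is accepted."""
    return offer_arrival_rate(c, u, v) * acceptance_probability(c)


def matching_technology(u, v, steps=2000):
    """m(u, v) = u * integral of h(c, u, v) over the cost density (uniform here)."""
    width = 1.0 / steps
    total = 0.0
    for i in range(steps):
        c = (i + 0.5) * width                 # midpoint of each cost interval
        total += individual_hazard(c, u, v) * 1.0 * width
    return u * total


m_small = matching_technology(100.0, 25.0)
m_large = matching_technology(200.0, 50.0)
print(m_small, m_large)
```

Because the assumed individual hazard depends on u and v only through their ratio, the implied matching technology exhibits constant returns to scale.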

Functional forms commonly employed in empirical work include
$h_k = \exp(\gamma X_k)$ (Lancaster 1979; Flinn and Heckman 1983; Burdett et al.
1984; Topel 1984) and $h_k = [1 + \exp(\gamma X_k)]^{-1}$ (Nickell 1979; Ham
and Rea 1987), where $X_k$ is a vector of explanatory variables that are
assumed to affect this transition probability. Thus $X_k$ should include
variables reflecting market conditions ($u$ and $v$; the cited studies
include only $u$) and the determinants of search costs. Unemployment
compensation is an obvious example; however, personal and
demographic variables (say age, education level, assets, and family size) will
also proxy search costs to the extent that they are correlated with
search efficiency or off-the-job income. These variables have in fact
proved successful in empirical work. To estimate a constant returns to
scale matching technology for sector $i$, in particular, unemployment
and vacancies must appear together as $u_i/v_i$ in $X_k$.

27 When one is estimating a matching technology in a general equilibrium setting,
it may be incomplete to specify this technology as a function only of the numbers
of participants in the matching processes. That is, in economies in which preferences
or technologies are random, the derived matching technology will also be state
dependent when, e.g., either individuals’ equilibrium hazard functions or the distribution of
search costs is state dependent; i.e., $h(c_k, u, v, s)$ or $\Omega(c, s)$ implies $m(u, v, s)$.
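
As a rough illustration of how the ratio restriction enters these specifications, the sketch below evaluates an exponential and a logistic-type hazard when market conditions enter only through $\log(u_i/v_i)$. The regressor list, the coefficient values, and the function names are hypothetical, and the logistic form is written exactly as $[1 + \exp(\gamma X)]^{-1}$.

```python
import math


def index(x, gamma):
    """Linear index gamma'X for one worker."""
    return sum(g * xi for g, xi in zip(gamma, x))


def exponential_hazard(x, gamma):
    return math.exp(index(x, gamma))


def logistic_hazard(x, gamma):
    return 1.0 / (1.0 + math.exp(index(x, gamma)))


def regressors(u_i, v_i, benefit):
    """Constant, log unemployment-vacancy ratio, and unemployment compensation."""
    return [1.0, math.log(u_i / v_i), benefit]


gamma = [-1.0, -0.5, -0.8]                   # hypothetical coefficients
x_small = regressors(100.0, 40.0, benefit=0.3)
x_large = regressors(500.0, 200.0, benefit=0.3)   # same ratio, larger sector

h_exp_small = exponential_hazard(x_small, gamma)
h_exp_large = exponential_hazard(x_large, gamma)
h_log_small = logistic_hazard(x_small, gamma)
print(h_exp_small, h_exp_large, h_log_small)
```

Since only the ratio enters the index, the fitted hazard is invariant to proportional changes in $u_i$ and $v_i$, which is the constant-returns restriction.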

Whether or not factor market search has empirically significant 
general equilibrium implications, beyond providing an equilibrium 
explanation for aggregate unemployment, is an open, important, but 
as yet ill-defined question. The main contribution of this paper is the 
development of a relatively simple multisector matching model that is 
suited to formulating precise questions concerning the aggregate con¬ 
sequences of search in the context of conventional applied problems. 
The specific functional forms described above can be used in future 
work to estimate sectoral matching technologies as a step toward pro¬ 
viding quantitative answers to these questions. 


Appendix 


A. Proof of Proposition 1 

Dropping sector-specific notation for the moment, let $I$ denote aggregate
sectoral income net of asset changes:

$$I = u(rW_u - \dot{W}_u) + e(rW_e - \dot{W}_e) + v(rW_v - \dot{W}_v) + e(rW_f - \dot{W}_f).$$ (A1)

Substituting from (4) and (6) into (A1) yields $I = pe + \Sigma(m - be)$. Since
$\Sigma = p(\partial Q/\partial e)$ in a steady-state equilibrium (see [11]), (8a) gives

$$I = p(rQ).$$ (A2)

Alternatively, substituting $u = n - e$ and $v = k - e$ into (A1) yields

$$I = n(rW_u - \dot{W}_u) + k(rW_v - \dot{W}_v) + e[r\Sigma - (\dot{W}_e + \dot{W}_f - \dot{W}_v - \dot{W}_u)],$$

which, on substitution for $\Sigma$ from (6), gives

$$I = (n - \rho e)(rW_u) + (k - \rho e)(rW_v) + \rho pe - \frac{be}{r + b}(\dot{W}_e + \dot{W}_f) - (u\dot{W}_u + v\dot{W}_v),$$ (A3)

where $\rho = r/(r + b)$. Equation (15a) then follows from (A2) and (A3) and the
following two equations:
following two equations: 



u,Wl + ViW' v = tri 


0€] 0€\ 


bt\) + m' 


arag., 

de 2 &2 


- be 2 ). 


In turn these relationships are established as follows. Adding (4a) and (4c) 
gives ■*. 


r(Wi + Wf) 


Pi - bl' + 


W[ 

Bet 


+ 



(m l — be j) + 
W± <W)_ = (Bp,/Be y) - bQZ'Ide,) = dT\ 3Q, 

Btj de, r + b + m{, + m{ \ 

(see [8c]). Multiplying (4b) by u, and (4a) by v, and adding gives 

r(u,W‘ u + r.Wi) = m'T + (u,- be x ) 

\ Be x ) 

. / aw;, aw[,\, 2 , . 

+ w.-r— + v,— — \(m - be 2 ). 

\ 0 €2 0 €2 / 

In turn, taking derivatives and using rW' u = roJ,S' and rW' v = m[S' from (7a), 
(7b), and (13) in a steady state gives 


u, ■ 


dWl 

Be, 


+ v, ■ 


ajn 

Be, 


m'tfl'IBe,) 


t + b + mL + mh 


B. Proof of Proposition 3 

Equations (17a') and (17b') yield

$$\lambda_{N1}\hat{e}_1 + \lambda_{N2}\hat{e}_2 = \hat{n} - (\lambda_{N1}\hat{a}_{N1} + \lambda_{N2}\hat{a}_{N2}),$$

$$\lambda_{K1}\hat{e}_1 + \lambda_{K2}\hat{e}_2 = \hat{k} - (\lambda_{K1}\hat{a}_{K1} + \lambda_{K2}\hat{a}_{K2}).$$

As before, I want to describe $\lambda_{i1}\hat{a}_{i1} + \lambda_{i2}\hat{a}_{i2}$ as a function of $r\hat{W}_u - r\hat{W}_v$, and as
before, I solve

$$0 = \theta_{Ni}\hat{a}_{Ni} + \theta_{Ki}\hat{a}_{Ki}, \qquad i = 1, 2,$$

and the expressions for $\sigma_i$ for the $\hat{a}_{ij}$ as functions of $r\hat{W}_u - r\hat{W}_v$. This yields
(21a') and (21b').

Equations (17c') and (17d') yield

$$\theta_{Ni}(r\hat{W}_u) + \theta_{Ki}(r\hat{W}_v) = \hat{p}_i - (\theta_{Ni}\hat{a}_{Ni} + \theta_{Ki}\hat{a}_{Ki}), \qquad i = 1, 2.$$

Whereas the short-run percentage change in the input-output coefficient
$a_{Ni} = (n_i - \rho e_i)/(rQ_i - \rho e_i)$ equals

$$\hat{a}_{Ni} = \frac{dn_i}{n_i - [re_i/(r + b)]} - \frac{d\,rQ_i}{rQ_i - [re_i/(r + b)]},$$

the corresponding steady-state term is different and equals

$$\hat{a}_{Ni} = \frac{dn_i - [r\,de_i/(r + b)]}{n_i - [re_i/(r + b)]} - \frac{d\,rQ_i - [r\,de_i/(r + b)]}{rQ_i - [re_i/(r + b)]}.$$

The same distinction applies to the $\hat{a}_{Ki}$. Thus since

$$p_i(d\,rQ_i - \rho\,de_i) = p_i\frac{\partial rQ_i}{\partial n_i}(dn_i - \rho\,de_i) + p_i\frac{\partial rQ_i}{\partial k_i}(dk_i - \rho\,de_i)
= (rW_u^i)(dn_i - \rho\,de_i) + (rW_v^i)(dk_i - \rho\,de_i),$$

it follows that $\theta_{Ni}\hat{a}_{Ni} + \theta_{Ki}\hat{a}_{Ki}$ are zero in steady-state equilibrium. This yields
(21c') and (21d').

C. A Tax on Capital Income in Sector 1

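In (21c) and (21c'), and in the short-run and steady-state elasticities
$\sigma_1$ and $\tilde{\sigma}_1$, $r\hat{W}_v + \hat{T}$ replaces $r\hat{W}_v$. Therefore, the short-run tax model is summarized
by

$$\lambda_{N1}\hat{X}_1 + \lambda_{N2}\hat{X}_2 = \hat{N} + \delta_N(r\hat{W}_u - r\hat{W}_v) - \lambda_{N1}\theta_{K1}\sigma_1\hat{T},$$

$$\lambda_{K1}\hat{X}_1 + \lambda_{K2}\hat{X}_2 = \hat{K} - \delta_K(r\hat{W}_u - r\hat{W}_v) + \lambda_{K1}\theta_{N1}\sigma_1\hat{T},$$

$$\theta_{N1}(r\hat{W}_u) + \theta_{K1}(r\hat{W}_v) = \hat{p}_1 + (C_1^1 + D_1^1)\hat{X}_1 + (C_2^1 + D_2^1)\hat{X}_2 - \theta_{K1}\hat{T},$$

$$\theta_{N2}(r\hat{W}_u) + \theta_{K2}(r\hat{W}_v) = \hat{p}_2 + (C_1^2 + D_1^2)\hat{X}_1 + (C_2^2 + D_2^2)\hat{X}_2,$$

$$\hat{p}_1 = \hat{p}_2.$$

The steady-state tax model is summarized by

$$\lambda_{N1}\hat{e}_1 + \lambda_{N2}\hat{e}_2 = \hat{n} + \delta_N(r\hat{W}_u - r\hat{W}_v) - \lambda_{N1}\theta_{K1}\tilde{\sigma}_1\hat{T},$$

$$\lambda_{K1}\hat{e}_1 + \lambda_{K2}\hat{e}_2 = \hat{k} - \delta_K(r\hat{W}_u - r\hat{W}_v) + \lambda_{K1}\theta_{N1}\tilde{\sigma}_1\hat{T},$$

$$\theta_{N1}(r\hat{W}_u) + \theta_{K1}(r\hat{W}_v) = \hat{p}_1 - \theta_{K1}\hat{T},$$

$$\theta_{N2}(r\hat{W}_u) + \theta_{K2}(r\hat{W}_v) = \hat{p}_2,$$

$$\hat{e}_1 - \hat{e}_2 = -\sigma_d(\hat{p}_1 - \hat{p}_2).$$

Unemployment, the Market for Interviews,
and Wage Determination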


Michael Sattinger 

State University of New York at Albany 


A model of equilibrium unemployment and vacancies is presented 
in which the absence of a market for interviews can yield exter¬ 
nalities and inefficiency. Inefficiency results in market forces that 
lead firms to charge fees for interviews or to change the wage rate. 
These market forces require substantial information and may not 
operate. A graphical analysis shows how such forces would cause 
wages to adjust to shifts in supply, taxes, and transfers between 
workers and firms. With excessive unemployment, efficiency re¬ 
quires a transfer of income from workers to firms. 


I. Introduction 

This paper examines the causes of and cures for inefficient equilib¬ 
rium levels of unemployment in an economy, using a model of match¬ 
ing between unemployed workers and vacant jobs. Inefficiency is 
defined to occur when an additional worker’s contribution to the 
present value of future production differs from the worker’s forgone 
opportunities outside the labor market. This inefficiency arises from 
the absence of a market for interviews between unemployed workers 
and firms with vacancies. When inefficient levels of unemployment 
arise, market forces may lead firms to change the wages they pay or to 
charge or pay fees for interviews. But these market forces require 
substantial information and may not operate. Movement of the
economy toward efficient levels of unemployment requires shifts in the
distribution of income between workers and firms, brought about by
fees for interviews, wage changes, or taxes and transfers.

A fundamental question in the study of labor markets is whether 
the decentralized decisions of workers and employers yield efficient 
levels of unemployment and vacancies. While some authors have ar¬ 
gued that search procedures of workers and firms are generally 
efficient (see discussions by Prescott [1975] and Hall [1979]), others 
have suggested that externalities and congestion cause inefficiency 
(Tobin 1972; Phelps 1972). 

More recent analysis of labor markets involving search suggests a 
number of potential sources of externalities and inefficiency. Dia¬ 
mond (1982) analyzes the entry decisions of workers and finds that 
with a Nash bargaining solution, the expected present discounted 
value of earnings of a new worker does not in general equal the 
worker’s social marginal product. Decisions determining search
intensity (Mortensen 1982a, 1982b; Pissarides 1984b) and which matches to
form (Pissarides 1984a) can also generate externalities (see also Frank 
[1985] and a summary by Mortensen [1986]). 

The model developed in Section II concentrates on entry decisions 
as the source of externalities. It shares many essential features with 
Diamond’s model and others but in the initial version takes the wage 
rate as given instead of determined by Nash bargaining. In the model, 
the pools of unemployed workers and vacancies reach equilibrium 
levels when the rate at which matches form equals the rate at which 
they dissolve. Section III then analyzes the externalities in terms of 
the absence of a market for interviews. Section IV considers the cre¬ 
ation of an explicit market for interviews if firms can influence the 
number of job seekers for a vacancy by charging or paying a fee for 
an interview. Then the equilibrium fee will be such that externalities 
are eliminated. A similar line of reasoning is applied to the wage rate 
in Section V. A graphical analysis shows how the wage rate and the 
unemployment to vacancy ratio must adjust to yield efficiency and 
how they respond to shifts in the supply of workers. This efficient 
wage mechanism is then compared with alternative wage determina¬ 
tion models, such as supply and demand, Nash bargaining, and se¬ 
quential bargaining. The final section (Sec. VI) draws conclusions 
concerning appropriate policies and the economy's response to taxes 
and transfers. 

II. The Model 

Suppose that a match or combination of a single worker with a single 
job yields production of y per period using a technology with fixed 
proportions. Let $L$ be the total number of workers, $J$ the total number
of jobs, $E$ the total number of matches (equal to both the number
of employed workers and the number of filled jobs), $U$ the number
of unemployed workers, and $V$ the number of vacant jobs. With $E$
matches, total production in the economy is $Ey$ per period.

A match is formed with probability q when a firm interviews an 
unemployed worker for a vacant job. The number of interviews oc¬ 
curring per period is a function of the numbers of unemployed work¬ 
ers and vacant jobs, p = p(U, V). The function p(U, V) is assumed to 
be homogeneous of degree one, have continuous first derivatives, 
and be less than the minimum of $U$ and $V$. Matches break up with
probability $\gamma$ per period, so that the pools of unemployed workers
and vacancies are constantly replenished. In their likelihood of get¬ 
ting an interview, forming a match, or breaking up from a match, all 
workers are identical. Similarly, all jobs are identical. 

Worker movements between the two states of employment and
unemployment are described by a two-state, continuous-time Markov
process. The likelihood of getting an interview per period is the
number of interviews per period, $p(U, V)$, divided by the total number of
unemployed workers, $U$. The transition rate from unemployment to
employment, $\lambda$, is then the likelihood of getting an interview per
period, $p/U$, times the probability that the interview results in a match, $q$:

$$\lambda = \frac{pq}{U}.$$ (1)

The transition rate from employment to unemployment is given by
the rate at which matches break up, $\gamma$. In equilibrium, the
unemployment rate $u$, given by $U/L$, is related to the transition rates as follows:

$$u = \frac{U}{L} = \frac{\gamma}{\lambda + \gamma}.$$ (2)

Similarly, the movements of jobs between the states of filled and
vacant are described by a Markov process. The transition rate from
vacant to filled, $\mu$, equals the likelihood of getting an interview per
period, $p/V$, times the likelihood that an interview results in a match,
$q$:

$$\mu = \frac{pq}{V}.$$ (3)

Again, the transition rate from filled to vacant is given by $\gamma$, so that in
equilibrium the vacancy rate is related to the transition rate as follows:

$$v = \frac{V}{J} = \frac{\gamma}{\mu + \gamma}.$$ (4)

In equilibrium the number of matches formed will equal the
number of unemployed workers getting jobs, the number of vacant jobs
filled, and the number of matches breaking up: $pq = \lambda U = \mu V = \gamma E$.

Then by substituting $L - E$ for $U$ and $J - E$ for $V$, we get

$$p(L - E, J - E)q = \gamma E.$$ (5)

This expression can be solved for the equilibrium value of $E$, which
then determines the equilibrium values of all other variables.
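
To illustrate how (5) pins down the equilibrium, the sketch below solves it by bisection and then recovers the transition rates and the unemployment and vacancy rates. The interview function $p(U, V) = cUV/(U + V)$ with $c < 1$ is an assumption chosen because it is homogeneous of degree one and lies below the minimum of $U$ and $V$; the paper does not specify a functional form, and the parameter values are hypothetical.

```python
def interviews(U, V, c=0.8):
    """Hypothetical interview function p(U, V), homogeneous of degree one."""
    if U <= 0.0 or V <= 0.0:
        return 0.0
    return c * U * V / (U + V)


def equilibrium_employment(L, J, q, gamma):
    """Solve p(L - E, J - E) * q = gamma * E for E by bisection."""
    lo, hi = 0.0, min(L, J)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        # Matches formed minus matches dissolved at employment level mid.
        excess = q * interviews(L - mid, J - mid) - gamma * mid
        if excess > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


L, J, q, gamma = 100.0, 90.0, 0.5, 0.1
E = equilibrium_employment(L, J, q, gamma)
U, V = L - E, J - E
lam = q * interviews(U, V) / U     # transition rate out of unemployment, eq. (1)
mu = q * interviews(U, V) / V      # transition rate out of vacancy, eq. (3)
print(E, U / L, V / J)
```

Bisection works here because matches formed fall and matches dissolved rise as $E$ increases, so the excess of one over the other changes sign exactly once on the interval.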

Assume that a worker gets a wage of $w$ per period while employed,
leaving profits of $y - w$ per period to the firm offering the job. Let $W_U$
and $W_E$ be the expected present discounted values of future wages for
unemployed and employed workers, respectively. Let $W_V$ and $W_F$ be
the expected present discounted values of future profits for vacant
and filled jobs, respectively. Let $r$ be the discount rate. Diamond
(1982, p. 220) derives the equilibrium values of these wealth terms
from the following relations:

$$rW_U = \left(\frac{pq}{U}\right)(W_E - W_U),$$ (6)

$$rW_E = w - \gamma(W_E - W_U),$$ (7)

$$rW_V = \left(\frac{pq}{V}\right)(W_F - W_V),$$ (8)

$$rW_F = y - w - \gamma(W_F - W_V).$$ (9)

These relationships may be understood as follows. For each state
(e.g., unemployment), the equivalent flow of income for a level of
wealth equals any wages or profits received per period, plus the
probability of changing states times the wealth gain or loss from such a
change. Subtracting (6) from (7) and (8) from (9) and solving yields

$$W_E - W_U = \frac{w}{r + \gamma + (pq/U)}$$ (10)

and

$$W_F - W_V = \frac{y - w}{r + \gamma + (pq/V)}.$$ (11)

Then from (6) and (8),

$$W_U = \frac{pq}{rU}\,\frac{w}{r + \gamma + (pq/U)}$$ (12)

and

$$W_V = \frac{pq}{rV}\,\frac{y - w}{r + \gamma + (pq/V)}.$$ (13)

So far, the model has closely followed Diamond’s development of
search equilibrium. However, Diamond imposes a bargaining
solution in which the surplus from a match is divided equally, so $w$ is
determined by the condition that $W_E - W_U = W_F - W_V$. Instead, this
model considers the outcome with arbitrary $w$. Later, in Section V, the
determination of the wage rate will be considered.
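
The wealth terms in (10)-(13) are simple to compute once the transition rates are known. The sketch below takes $\lambda = pq/U$ and $\mu = pq/V$ as given inputs (hypothetical numbers), evaluates the four wealth terms, and also computes the wage at which the surplus would be split equally, obtained by setting (10) equal to (11). The function names are invented for the example.

```python
def wealth_terms(w, y, r, gamma, lam, mu):
    """Return (W_U, W_E, W_V, W_F) from equations (10)-(13)."""
    worker_gain = w / (r + gamma + lam)          # W_E - W_U, eq. (10)
    firm_gain = (y - w) / (r + gamma + mu)       # W_F - W_V, eq. (11)
    W_U = lam * worker_gain / r                  # eq. (12)
    W_V = mu * firm_gain / r                     # eq. (13)
    return W_U, W_U + worker_gain, W_V, W_V + firm_gain


def equal_split_wage(y, r, gamma, lam, mu):
    """Wage at which W_E - W_U = W_F - W_V, from (10) and (11)."""
    return y * (r + gamma + lam) / (2.0 * (r + gamma) + lam + mu)


y, r, gamma, lam, mu = 1.0, 0.05, 0.1, 0.4, 0.6   # hypothetical parameters
w = 0.5
W_U, W_E, W_V, W_F = wealth_terms(w, y, r, gamma, lam, mu)
w_star = equal_split_wage(y, r, gamma, lam, mu)
print(W_U, W_E, W_V, W_F, w_star)
```

Substituting the computed values back into (6)-(9) provides a quick consistency check on the algebra.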

Finally, assume that workers enter the labor force whenever the
expected benefits per period, $rW_U$, exceed the forgone opportunities.
As $rW_U$ increases, the number of workers rises until the value of
forgone opportunities for the marginal entrant again equals $rW_U$.
Similarly, the number of jobs will depend on the expected benefits for a
vacant job, $rW_V$. We may suppose that supplies of workers and jobs
to the economy are given by

$$L = L_0(rW_U)^{\alpha},$$ (14)

$$J = J_0(rW_V)^{\beta},$$ (15)

where $\alpha$ and $\beta$ are positive.

The next section considers whether the resulting equilibrium is 
efficient. 

III. Externalities of Entry Decisions 

The entry of one more worker into the labor force will raise the 
unemployment rate for the other workers and reduce the vacancy 
rate for firms, thereby benefiting firms and imposing costs on the 
other workers. These benefits and costs will not be considered by the 
worker in deciding whether to enter the labor force. Instead, he or 
she will look at the expected present discounted value of future 
wages, given by $W_U$. If the benefits and costs of this decision do not 
balance, the worker’s entry will impose positive or negative net exter¬ 
nalities on the rest of the economy (the private benefits and costs will 
already be balanced for the marginal entrant). 

Following Diamond (1982), the net externality of an additional 
worker can be analyzed by comparing the present discounted value of 
an additional unemployed worker’s social marginal product (briefly, 
the worker’s $MP_L$) with the present discounted value of the private 
return, given by $W_U$. The worker’s $MP_L$ arises from future changes in 
total production as $E$ moves from its steady-state equilibrium value to 
its new value. Diamond (1980) has elegantly solved the problem of 
calculating this present value and applied it (1982, p. 223) in the 
context of search equilibrium. From (5), by implicit differentiation 
with respect to $L$, the steady-state change in employment from the 
addition of one more worker is 


$$\frac{\partial E}{\partial L} = \frac{qp_1}{\gamma + qp_1 + qp_2}, \tag{16}$$


where $p_1 = \partial p/\partial U$ and $p_2 = \partial p/\partial V$. By Diamond’s result, the expected 
present discounted value of the future change in production is a 
simple modification of the steady-state change times output per 
match, $y$: 


$$MP_L = \frac{qp_1 y}{r\,(r + \gamma + qp_1 + qp_2)}. \tag{17}$$


Let $NE_L$ be the net externality generated by one more worker. 
Then 

$$NE_L = MP_L - W_U. \tag{18}$$


Since $p(U, V)$ is homogeneous of degree one, $p_1$ and $p_2$ are 
homogeneous of degree zero, so that their values depend only on the ratio 
$U/V$. Then $MP_L$, $W_U$, and $NE_L$ depend only on the ratio $U/V$ and not 
on the absolute levels of unemployment and vacancies; that is, 
doubling $U$ and $V$ will leave $NE_L$ unaffected. Therefore, the condition 
that $NE_L = 0$ determines the ratio of unemployment to vacancies such 
that no net externalities occur. 
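
The homogeneity argument can be checked numerically. The following sketch is an illustration only: it assumes a Cobb-Douglas interview function and arbitrary parameter values (neither appears in the original), computes $NE_L$ from (12), (17), and (18), and compares the result before and after doubling $U$ and $V$.

```python
def net_externality_worker(U, V, w, y, r, gamma, q, A=1.0, a=0.5):
    """NE_L = MP_L - W_U, using (12), (17), and (18).

    The interview function p = A * U**a * V**(1 - a) is an assumed
    Cobb-Douglas form, chosen because it is homogeneous of degree one.
    """
    p = A * U**a * V**(1 - a)
    p1 = a * p / U          # dp/dU, homogeneous of degree zero
    p2 = (1 - a) * p / V    # dp/dV, homogeneous of degree zero
    MP_L = q * p1 * y / (r * (r + gamma + q * p1 + q * p2))   # (17)
    W_U = p * q / (r * U) * w / (r + gamma + p * q / U)       # (12)
    return MP_L - W_U                                         # (18)


# Illustrative (assumed) parameter values.
common = dict(w=0.6, y=1.0, r=0.05, gamma=0.1, q=0.5)
ne_base = net_externality_worker(U=10.0, V=5.0, **common)
ne_doubled = net_externality_worker(U=20.0, V=10.0, **common)
print(ne_base, ne_doubled)
```

Both calls share the same ratio $U/V = 2$, so the two printed values should agree up to floating-point error, while changing the ratio itself changes $NE_L$.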

If $NE_L > 0$, the economy will be inefficient in the sense that the 
gain in production from an additional entrant is sufficient to allow all 
losers to be compensated, with some production left over. The 
economy’s market equilibrium would not be Pareto optimal. Similarly, if 
$NE_L < 0$, the gain in alternative activities of an unemployed worker 
exiting the labor market would allow a redistribution that would leave 
all losers compensated, with some production left over. The 
requirement that $NE_L = 0$ can also be derived as a first-order condition from 
the maximization of the present discounted value of future 
production subject to the constraint that the cost of factors, $LW_U + JW_V$, not 
exceed some constant level. 

The condition for the efficient ratio of $U/V$ can be expressed more 
simply as follows. Let $NE_J$ be the net externality of one more job. The 
equivalent expression for $NE_J$ corresponding to (18) is 

$$NE_J = MP_J - W_V = \frac{qp_2 y}{r\,(r + \gamma + qp_1 + qp_2)} - W_V. \tag{19}$$


Then setting $NE_L$ and $NE_J$ equal to zero, one obtains from (18) and 
(19) 

$$\frac{W_U}{W_V} = \frac{p_1}{p_2}. \tag{20}$$

The three conditions $NE_L = 0$, $NE_J = 0$, and (20) are equivalent. 1 
The condition in (20) means that, for efficiency, the ratio of the 
wealth terms for an unemployed worker and a vacant job must equal 
the ratio of their contributions to interviews. If $W_U/W_V$ exceeds $p_1/p_2$, 
the net externality of an additional worker, $NE_L$, will be negative and 
$NE_J$ will be positive; if $W_U/W_V$ is less than $p_1/p_2$, then $NE_L$ is positive 
and $NE_J$ is negative. 

The efficient condition presented by Diamond (1982, p. 225, 
expression 35) is a special case of (20), derived under the assumption 
that the surplus from a match is shared. This assumption implies that 
$UW_U = VW_V$, so that $W_U/W_V = V/U$. The condition for efficiency is 
then $Up_1 = Vp_2$, and the other results follow directly. 2 


IV. The Market for Interviews 

It is well known that the absence of a market for a good is a cause of 
externalities. This section analyzes the externalities described in the 
previous section in terms of the absence of a market for interviews. 
Interviews are produced by inputs of unemployed workers and va¬ 
cant jobs. But these products are neither bought nor sold. A market 
for interviews would reward workers according to the number of 
interviews generated, regardless of who receives or loses them. Net 
externalities would then be eliminated. 

The value of an interview to a worker is the chance of moving from 
unemployment to employment times the gain in wealth, $q(W_E - W_U)$. 
Similarly, the value of an interview to an employer is $q(W_F - W_V)$, so 
the total value of an interview is $q(W_F - W_V + W_E - W_U)$. Efficiency 
in a market for interviews would require that the compensation 
needed to get one more worker to join the labor force, $rW_U$, equal the 
value of interviews generated: 

$$rW_U = qp_1(W_F - W_V + W_E - W_U). \tag{21}$$

An alternative way of deriving this condition is to calculate the exter¬ 
nalities generated by a worker’s entry. One more worker gets on 


1 First, the wealth levels for the economy’s participants satisfy the budget constraint 
$EW_E + UW_U + EW_F + VW_V = Ey/r$. Manipulation using (6)-(9) yields 
$(\gamma L + rU)W_U + (\gamma J + rV)W_V = \gamma Ey/r$. If $NE_L = 0$, substitution of $MP_L$ for $W_U$ in this budget constraint 
and manipulation yield $NE_J = 0$. Similarly, $NE_J = 0$ implies $NE_L = 0$. Also, if $W_U > 
(<)\ MP_L$, then from the budget constraint $W_V < (>)\ MP_J$. Therefore, (20) implies $NE_L 
= NE_J = 0$. 

2 The linear technology studied by Diamond (1982) and others implies that the 
number of interviews per worker increases indefinitely as the number of unemployed 
falls to zero. This is inconsistent with the assumption that the number of interviews 
cannot exceed the number of either unemployed or vacancies. 
average $p/U$ interviews per period but generates only $p_1$ extra 
interviews. The value of an extra interview generated for firms is 
$q(W_F - W_V)$, for a total positive externality of $qp_1(W_F - W_V)$. At the same time, 
the worker changes the number of interviews for other workers by 
$p_1 - (p/U)$, a reduction valued at $q(W_E - W_U)$ per interview. Setting the sum of 
the two externalities equal to zero and applying (6) again yields (21). 
The corresponding condition for employers is 

$$rW_V = qp_2(W_F - W_V + W_E - W_U). \tag{22}$$

If conditions (21) and (22) hold, then $W_U/W_V = p_1/p_2$ and the 
condition for zero net externalities in (20) is satisfied. 
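
The link between (20), (21), and (22) can be illustrated numerically. The sketch below is a hedged example rather than the paper's own procedure: it holds $U$ and $V$ fixed, assumes a Cobb-Douglas interview function with arbitrary parameter values, solves (12) and (13) for the wage at which $W_U/W_V = p_1/p_2$, and then evaluates both sides of (21) and (22).

```python
def efficient_wage(U, V, y, r, gamma, q, A=1.0, a=0.5):
    """Wage at which W_U / W_V = p_1 / p_2 for given U and V.

    Assumes the Cobb-Douglas interview function p = A * U**a * V**(1 - a);
    U and V are held fixed here rather than determined by (14) and (15).
    """
    p = A * U**a * V**(1 - a)
    p1, p2 = a * p / U, (1 - a) * p / V
    dU = r + gamma + p * q / U   # denominator in (10) and (12)
    dV = r + gamma + p * q / V   # denominator in (11) and (13)
    # From (12)/(13): W_U/W_V = (V/U) * (w/dU) / ((y - w)/dV).
    # Setting this equal to p1/p2 fixes k = w / (y - w).
    k = (p1 / p2) * (U / V) * dU / dV
    w = y * k / (1 + k)
    W_U = p * q / (r * U) * w / dU            # (12)
    W_V = p * q / (r * V) * (y - w) / dV      # (13)
    surplus = w / dU + (y - w) / dV           # W_E - W_U + W_F - W_V
    return {"w": w, "W_U": W_U, "W_V": W_V, "p1": p1, "p2": p2, "surplus": surplus}


# Illustrative (assumed) parameter values.
r_, q_ = 0.05, 0.5
res = efficient_wage(U=10.0, V=5.0, y=1.0, r=r_, gamma=0.1, q=q_)
print("efficient wage:", res["w"])
print("(21):", r_ * res["W_U"], "vs", q_ * res["p1"] * res["surplus"])
print("(22):", r_ * res["W_V"], "vs", q_ * res["p2"] * res["surplus"])
```

At the wage returned by this function the two sides of (21) agree, as do the two sides of (22), which is the sense in which the three conditions coincide.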

One way a market for interviews could come about is through a fee 
paid or collected by a firm for an interview. 4 This fee would bring 
about a transfer from workers to firms (or from firms to workers) 
until condition (20) is satisfied. 

Suppose that employer $i$ charges a fee $z_i$ for an interview whenever 
a vacancy occurs. Assume that workers know the fee level and also the 
likelihood of getting an interview at the firm. The wealth equation for 
a worker seeking a job at employer $i$ is then 

$$rW_{Ui} = \frac{qp(U_i, V_i)}{U_i}\,(W_E - W_{Ui}) - \frac{p(U_i, V_i)}{U_i}\,z_i, \tag{23}$$


where $p(U_i, V_i)/U_i$ is the worker’s likelihood per period of getting an 
interview. The firm cannot directly control the frequency of 
interviews. However, if the interview function $p(U, V)$ is taken as given, the 
firm can influence the likelihood of interviews by charging a higher or 
lower fee. If the firm raises the fee $z_i$, the value of $rW_{Ui}$ initially falls 
below the value elsewhere, $rW_U$. Fewer workers would then seek jobs 
at the firm; this reduction in $U_i/V_i$ would continue until $rW_{Ui} = rW_U$. 
The firm therefore faces a trade-off between the fee it is able to 


3 Conditions (21) and (22) are equivalent. From (6) and (8), 

$$r(UW_U + VW_V) = qp(U, V)(W_F - W_V + W_E - W_U) = q(Up_1 + Vp_2)(W_F - W_V + W_E - W_U).$$

Then taking $U$ times (21), subtracting from this expression, and dividing by $V$ yield 
(22). Also from this expression, if $rW_U > (<)\ qp_1(W_F - W_V + W_E - W_U)$, then $rW_V < 
(>)\ qp_2(W_F - W_V + W_E - W_U)$, so (20) implies (21) and (22). The efficiency condition 
can also be shown to minimize the cost per interview. A combination of $U$ unemployed 
workers and $V$ vacancies produces $p(U, V)$ interviews. With $rW_U$ as the opportunity cost 
of an additional unemployed worker and $rW_V$ as the opportunity cost of an additional 
vacancy, the cost per interview per period is $r(UW_U + VW_V)/p$. 

4 An alternative to firm interview fees is a system of labor exchanges, which would 
pay or charge workers and firms according to their contributions to interviews. Such 
labor exchanges could also lead to efficiency (Sattinger 1984). 
charge and the ratio of unemployed seekers per vacancy. The firm’s 
wealth equation is given by 

$$rW_{Vi} = \frac{qp(U_i, V_i)}{V_i}\,(W_{Fi} - W_{Vi}) + \frac{p(U_i, V_i)}{V_i}\,z_i. \tag{24}$$


Let $\theta_i = U_i/V_i$. A firm with a vacancy chooses $z_i$ (or the corresponding 
value of $\theta_i$) to maximize $W_{Vi}$ subject to the constraint that $W_{Ui} = W_U$. 
It can be shown that 

$$r\,\frac{dW_{Vi}}{d\theta_i} = \frac{r + \gamma}{r + \gamma + pq/V_i}\,\bigl(qp_1(W_E - W_U + W_{Fi} - W_{Vi}) - rW_{Ui}\bigr). \tag{25}$$

Setting $dW_{Vi}/d\theta_i = 0$ yields the first-order condition, which is 
equivalent to (21). The second-order condition for a maximization requires 
that $d^2W_{Vi}/d\theta_i^2$ be negative, which holds if $p_{11} < 0$. 

All employers face the same maximization problem and therefore 
obtain the same solutions for $z_i$ and $W_{Vi}$ (and hence for $\theta_i$). With all 
values of $\theta_i$ equal, $U/V = U_i/V_i$. Since $p_1$ is homogeneous of degree 
zero, $p_1(U/V, 1) = p_1(U, V)$ and (21) holds for the economy. 
Therefore, the optimal choice of a fee by employers will eliminate the 
inefficiency. Essentially, the fee generates an implicit market for interviews. 
Firms are led to maximize the total value of an interview net of costs 
of attracting workers. By changing the fee, firms then raise or lower 
the ratio of unemployed to vacancies depending on whether the value 
of interviews generated by an unemployed worker exceeds or falls 
short of the worker’s costs. 

Figure 1 shows the changes in an economy when firms’ interview 
fees bring about efficiency. From (20), efficiency arises when $W_U/W_V 
= p_1/p_2$. In figure 1, the relation between $W_U/W_V$ and $U/V$, determined 
by (12) and (13), is shown by the downward-sloping line through 
points A and B. The relation between $p_1/p_2$ and $U/V$ is shown by the 
steeper line through A and C. The reason equilibrium does not 
necessarily occur at point A, where (20) holds, is that a third relation 
governs the combinations of $W_U/W_V$ and $U/V$ that can occur in the 
economy. This third relation is determined by all combinations of 
$W_U/W_V$ and $U/V$ consistent with the supply relations (14) and (15) that 
can be achieved through redistributions between workers and firms. 
Possible redistributions are limited by the budget constraint that total 
wealth of all participants in the economy add up to the present 
discounted value of future production. Redistributing income from 
firms to workers (thereby raising $W_U/W_V$) results in more workers and 
fewer jobs and hence a higher ratio $U/V$. The supply relation in figure 
1, going through C and B, is therefore upward sloping. 

Inefficient ratios $U/V$ can arise because the supply curve does not 
necessarily go through point A, where (20) holds. In figure 1, in the 
absence of any fees or other mechanisms to redistribute income, 
equilibrium occurs at point B, where $W_U/W_V > p_1/p_2$ and the net 
externality of an additional worker, $NE_L$, is negative. The ratio $U/V$ is too high 
for efficiency. 

Fig. 1. Market equilibrium with $U/V$ too large (figure not reproduced; axis label: $U/V$) 

For the economy to move toward efficiency, a redistribution 
between workers and firms must occur. If the efficient fee mechanism 
operates, firms start charging fees for interviews when the economy is 
at point B. This reduces $W_U$ and raises $W_V$, shifting the $W_U/W_V$ curve 
downward until it goes through point C, where the equilibrium and 
efficient points now coincide. The redistribution reduces the number 
of workers in the economy and raises the number of jobs, leading to a 
reduction in $U/V$. 

Although $U/V$ declines, workers are worse off because the efficient 
fee reduces $W_U/W_V$ in this case. Also, since part of the adjustment 
occurs through changes in the ratio $W_U/W_V$, the economy does not 
move all the way to A but only to C, where (20) now holds. 

The analysis in this section shows that a market for interviews leads 
to an efficient solution even though no agent calculates the present 
discounted value of future changes in production. From (17) and 
(21), the value of an interview, $q(W_F - W_V + W_E - W_U)$, equals the 
present discounted value of the resulting future production, 
$qy/(r + \gamma + qp_1 + qp_2)$. When efficiency holds, an amount spent on factors now 
(by drawing workers or jobs into the economy) causes an increase in 
production over time with a present discounted value of the same 
amount. 5 

The informational requirements of the fee solution are large. 
Workers must know the fee and the ratio of unemployment to vacan¬ 
cies at each firm with which they seek interviews. However, these 
requirements disappear once an equilibrium is reached, when all fees 
and all ratios $U_i/V_i$ are identical. 

This section has considered market forces that would lead to 
interview fees in the labor market, taking the wage rate $w$ as given. The 
fees alter the wealth ratio $W_U/W_V$, moving the economy toward an 
efficient combination as in figure 1. The next section considers 
market forces leading to changes in the wage rate. 


V. Wage Adjustment 

Instead of fees, a firm may offer a higher or lower wage in order to 
alter the number of job seekers per vacancy. The informational 
requirements are similar to those for the fee solution. Suppose that firm 
$i$ pays wage $w_i$ and that this wage is known to workers along with the 
likelihood of getting an interview at the firm. The wealth equations 
for an individual firm paying a wage $w_i$ are analogous to (6)-(9), with 
the variables $U_i$, $V_i$, $w_i$, $W_{Ui}$, $W_{Ei}$, $W_{Vi}$, and $W_{Fi}$ substituted for the 
corresponding variables. Workers choose among prospective 
employers on the basis of $W_{Ui}$; the numbers of job seekers at various firms 
then adjust to yield equal values of $W_{Ui}$. Firms therefore face a 
trade-off between the wage they pay and the ratio of job seekers to 
vacancies. Again let $\theta_i = U_i/V_i$. A firm chooses a wage rate such that the 
corresponding ratio $\theta_i$ maximizes $W_{Vi}$ subject to the constraint that 
$W_{Ui} = W_U$. It can be shown that the first-order condition for the firm 
is 

$$\frac{dW_{Vi}}{d\theta_i} = \frac{q(r + \gamma)}{r(r + \gamma + pq)}\,(W_{Vi}\,p_1 - W_U\,p_2) = 0. \tag{26}$$


All firms face the same maximization problem, which has only one 
solution for $\theta_i$. Therefore, all firms choose identical wages, $U/V = 
U_i/V_i$, and the efficiency condition (20) holds. The second-order 
condition for the firm’s maximization problem is again that $p_{11} < 0$. 

In figure 1, the wage solution must shift the $W_U/W_V$ curve by exactly 
the same amount as the fee solution. At point B, a firm benefits by 
lowering the wage even though it gets fewer job seekers, since $p_1/p_2 < 
W_U/W_V$ and $dW_{Vi}/d\theta_i < 0$. When all firms do this, the wage falls until 
the $W_U/W_V$ curve goes through point C. 

5 In the problem of maximizing the present discounted value of aggregate output 
subject to a budget constraint, the value of the Lagrangian multiplier is one. Relaxing 
the constraint by a given amount therefore raises the objective by the same amount. 
Figure 2 shows how the wage rate and ratio $U/V$ respond to a shift 
in factor supply. Suppose that the supply of labor shifts from the 
relation in (14) to $L = L_0'(rW_U)^{\alpha}$, where $L_0' > L_0$. At each ratio $W_U/W_V$, 
the ratios $L/J$ and $U/V$ will be higher, shifting the supply curve to the 
right, from $S_1$ to $S_2$. With the wage rate remaining at its original value 
$w_1$, the market equilibrium moves from the efficient point A to the 
new market equilibrium at point B, with a higher ratio $U/V$ and a 
lower wealth ratio $W_U/W_V$. Efficiency requires, however, that the 
wealth ratio be reduced even further. At point B, each firm can profit 
by lowering the wage rate. The gain from lower labor costs more than 
makes up for the greater vacancy costs arising from a lower ratio of 
job seekers to vacancies at the firm. When all firms pursue this policy, 
the wage rate declines to $w_2$ and the line for the wealth ratio falls until 
it goes through point C, the new efficient equilibrium. The greater 
supply of labor lowers the economic welfare of workers in two ways. 
First, the higher unemployment rate reduces the wealth level $W_U$. 
Second, adjustment toward efficiency requires that workers be made 
even worse off by a reduction in the wage rate, shifting the wealth 
ratio $W_U/W_V$ downward. 6 

6 An exception to the results in fig. 2 occurs if the supply of jobs is perfectly elastic at 
a fixed level for $rW_V$. In this case, the number of jobs will expand or contract in 
proportion to the number of workers. After a shift in labor supply, the supply curve in 
fig. 2 will eventually return to its former position, with no change in the long-run ratio 
$U/V$ or $W_U/W_V$ or in the wage rate. 
The model of efficient wage determination developed here differs 
significantly from many other wage determination models. Contrary 
to the standard supply and demand model, the presence of unem¬ 
ployment does not drive the wage down indefinitely. Instead, equilib¬ 
rium unemployment arises that is consistent with a stable wage. A 
shift in the supply of labor leads not only to an adjustment in the wage 
rate, in the same direction indicated by the supply and demand 
model, but also to an adjustment in the unemployment rate. 

In the neoclassical model, the wage rate in equilibrium equals the 
marginal product of labor. In the model developed here, however, 
the marginal product of labor in a match is not defined because of 
the fixed proportions technology. However, a marginal product is 
defined for the economy as a whole, and the result at an efficient 
point is analogous: $W_U = MP_L$; that is, the wealth of an unemployed 
worker equals the expected present discounted value of the worker’s 
future contribution to production. Essentially, the meeting technol¬ 
ogy, as determined by the interview function p(U, V), replaces the 
production technology in the determination of wage rates. 

Contrary to the neoclassical theory, the implications for factor 
shares are unambiguous. In the model developed here, the wealth of 
all workers, employed and unemployed, is $EW_E + UW_U$. With (6) and 
(7), it can be shown that in the absence of fees $EW_E + UW_U = Ew/r$. 
Similarly, the wealth of all employers is $EW_F + VW_V = E(y - w)/r$. 
The ratio of worker wealth to employer wealth is then 

$$\frac{EW_E + UW_U}{EW_F + VW_V} = \frac{w}{y - w}. \tag{27}$$

This wealth ratio depends only on the wage rate w. Whenever the 
supply curve shifts to the right, the wage declines, lowering labor’s 
share of total wealth. Similarly, a leftward shift in the supply curve 
would raise labor’s share. 7 
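
The wealth identities behind this ratio can be confirmed numerically. The sketch below is illustrative only: it assumes a Cobb-Douglas interview function and arbitrary parameter values, and it takes the steady-state condition to be that separations equal new matches, $\gamma E = qp(U, V)$, which is an assumption about the form of equation (5).

```python
def aggregate_wealth(U, V, w, y, r, gamma, q, A=1.0, a=0.5):
    """Total worker and employer wealth from (10)-(13) in steady state.

    Assumptions: Cobb-Douglas interviews p = A * U**a * V**(1 - a), and a
    steady state in which separations gamma * E equal new matches q * p.
    """
    p = A * U**a * V**(1 - a)
    E = q * p / gamma                              # assumed steady-state employment
    gap_worker = w / (r + gamma + p * q / U)       # (10)
    gap_firm = (y - w) / (r + gamma + p * q / V)   # (11)
    W_U = p * q / (r * U) * gap_worker             # (12)
    W_V = p * q / (r * V) * gap_firm               # (13)
    workers = E * (W_U + gap_worker) + U * W_U     # E*W_E + U*W_U
    employers = E * (W_V + gap_firm) + V * W_V     # E*W_F + V*W_V
    return E, workers, employers


# Illustrative (assumed) parameter values.
w_, y_, r_rate = 0.6, 1.0, 0.05
E_, workers_, employers_ = aggregate_wealth(
    U=10.0, V=5.0, w=w_, y=y_, r=r_rate, gamma=0.1, q=0.5
)
print("worker wealth:", workers_, "vs E*w/r:", E_ * w_ / r_rate)
print("employer wealth:", employers_, "vs E*(y-w)/r:", E_ * (y_ - w_) / r_rate)
print("ratio:", workers_ / employers_, "vs w/(y-w):", w_ / (y_ - w_))
```

Under these assumptions the ratio depends only on $w$ and $y$; changing $U$, $V$, or $q$ alters the levels of wealth but not the ratio.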

In Diamond’s (1982) model, the wage is determined so that the 
surplus value from a match is split, following a Nash bargaining solu¬ 
tion. Diamond then studies the effects on the wage rate of benefits 
received by a worker while unemployed or by an employer when a job 
is vacant. In the model developed here, the surplus going to a worker 
divided by the surplus going to the employer is given by 

$$\frac{W_E - W_U}{W_F - W_V} = \frac{UW_U}{VW_V} = \frac{Up_1}{Vp_2}. \tag{28}$$


7 These results are derived with a fixed proportions production technology. 
Response of factor shares could be different in a model with greater elasticity of 
substitution in production. 
Unless the curve $p_1/p_2$ is a unitary elastic function of the ratio $U/V$ in 
figure 2, the division of the surplus from the match will change as the 
supply curve shifts. Even when the $p_1/p_2$ curve is unitary elastic, the 
division of the surplus may not be equal. 

Another model of wage determination is the sequential bargaining 
solution developed by Rubinstein and Wolinsky (1985). In their 
model, a wage bargain would depend on the degree of impatience, 
the length of the bargaining period, and the relative numbers of 
the two types of agents. Neither the Nash equilibrium solution nor the 
sequential bargaining solution necessarily yields efficiency in the 
economy. The same market forces described in this section may then 
operate to lead firms to make fixed wage offers that would affect the 
numbers of job seekers for a vacancy, bypassing the bargaining and 
eventually leading to an efficient wage. 8 

VI. Conclusions 

In the model developed here, the market equilibrium may generate a 
ratio of unemployed to vacancies that is inefficient. The extra produc¬ 
tion from an additional worker (or additional job) would then be 
more than sufficient to compensate an extra worker (or a firm with an 
extra job) for entering the market. The condition for efficiency is that 
$W_U/W_V = p_1/p_2$; that is, the ratio of wealths for an unemployed worker 
and a vacancy should equal the ratio of their marginal contributions 
to interviews. As this result suggests, the externalities that arise in the 
model can be explained by the absence of a market for interviews. 
While an arbitrary wage level may result in an inefficient market 
equilibrium, market forces exist that would move the economy to¬ 
ward efficiency. They operate when there is sufficient information 
available to workers about individual firm interview likelihoods, fees, 
and wages. Then firms are led to charge or collect fees or alter their 
wage rates, in order to affect job seeker likelihoods. These changes 
transfer income from workers to firms or from firms to workers un¬ 
til the efficiency condition is satisfied. If the wage mechanism oper¬ 
ates, the model shows how the wage rate, the distribution of income, 
and the ratio of unemployed to vacancies respond to shifts in the sup¬ 
ply of labor. 

The public policy conclusions of the model are straightforward. 
First, suppose that the economy begins with an inefficient ratio $U/V$ 
and that neither the fee nor wage mechanisms operate (or operate too 
slowly). Then taxes and transfers can accomplish exactly the same 


8 In a sectoral model, Frank (1985) suggests that firms could commit themselves to 
wage and employment strategies as an alternative to bargaining. 
adjustments that the efficient fee and wage mechanisms generate. If, 
for example, there is too much unemployment and the ratio $U/V$ is 
too high, then taxing workers and subsidizing firms will shift the 
wealth ratio $W_U/W_V$ down in figure 1 until it goes through point C. 
However, the distributional effects of such a policy may make it ob¬ 
jectionable. It is not a Pareto improvement and instead worsens 
the economic conditions of workers. When excessive unemployment 
arises, the higher unemployment reduces the economic wealth of 
unemployed workers but not enough to bring about efficiency. The 
efficient solution requires that their condition be reduced even fur¬ 
ther in order to discourage their participation in the labor market. It 
is hard to imagine that a policy of this nature would be actively pur¬ 
sued. Policies that instead subsidized exit from the labor market or 
alternative non-labor market activities could accomplish the same 
goal by shifting the supply curve of labor without worsening the con¬ 
ditions of the unemployed. When vacancies are excessive, of course, 
the simple tax and transfer policy would require that firms be made 
worse off; this redistributional policy may also be unacceptable. 

Now suppose that the wage mechanism always operates to move the 
economy to efficiency. Then the mechanism will counter the effects of 
any tax and transfer policy. For example, suppose that the economy 
begins at an efficient ratio $U/V$. A tax on output is then used to 
subsidize workers (equivalent to an increase in the wage rate). This 
policy initially shifts the $W_U/W_V$ curve upward. But the wage 
mechanism would return the curve to its former level by a wage reduction 
equal to the transfer. The tax and transfer policy would have no real 
effects on the economy. 

One limitation of the model developed here is that it assumes a 
fixed proportions technology. While this assumption simplifies the 
analysis, it raises the question of whether results would be different 
with a neoclassical production technology. With such a technology, 
the costs to a firm of leaving a marginal position vacant are zero, and 
one more worker could always be added. Vacancies could then be 
redefined as advertised positions. The firm would need to maintain 
these in order to replace workers that leave because of turnover. 

The model developed here provides an alternative way of analyzing 
key macroeconomic variables, in particular the unemployment and 
wage rates. The major conclusion is that in general there is no natural 
rate of unemployment, contrary to the new classical macroeconomic 
models. The unemployment rate can be affected by shifts in the 
supply of labor, tax and transfer policies, and unemployment 
compensation. Financial markets can also affect the equilibrium unemployment 
and wage rates. If the discount rate declines, the wealth flows $rW_E$ and 
$rW_F$ decline while the flows $rW_U$ and $rW_V$ increase. These in turn will 
lead to new equilibrium levels of unemployment, wages, and 
production. Further conclusions would require a complete specification of 
financial markets and their relation to the supply of jobs given in (15). 
Do Bad Bidders Become Good Targets? 


Mark L. Mitchell and Kenneth Lehn 

Securities and Exchange Commission 


This paper empirically examines one motive for takeovers: to 
change control of firms that make acquisitions that diminish the 
value of their equity. Firms that subsequently become takeover 
targets make acquisitions that significantly reduce their equity value, 
and firms that do not become takeover targets make acquisitions that 
raise their equity value. Within the sample of acquisitions by targets, 
the acquisitions that reduce equity value the most are those that are 
later divested either in bust-up takeovers or restructuring programs 
to thwart the takeover. This evidence is consistent with theories ad¬ 
vanced by Marris, Manne, and Jensen concerning the disciplinary 
role played by takeovers. 


I. Introduction 

Since Berle and Means (1933), it has been widely recognized that a 
potential divergence of interest exists between managers and stock¬ 
holders in corporations characterized by diffusely held equity. In re¬ 
cent years, economists have probed institutional arrangements that 
mitigate this potential conflict and attempted to understand why these 
arrangements vary from firm to firm. Among the forces that mitigate 
the manager-stockholder conflict are competitive labor and product 
markets, managerial compensation plans, the structure of equity own¬ 
ership, and the threat of corporate takeovers. 1 This paper focuses on 
the extent to which one of these forces, corporate takeovers, disci¬ 
plines managers in firms with a specific type of manager-stockholder 
conflict, namely, a conflict concerning the firms’ acquisition pro¬ 
grams. 

Our interest in this question emerges from general theoretical ar¬ 
guments by Marris (1963) and Manne (1965) and a more specific 
argument by Jensen (1986). Independently, Marris and Manne argue 
that the stock prices of firms in which managers deviate from profit 
maximization are less than they otherwise could be. They argue that 
this difference between actual and potential stock prices creates in¬ 
centives for outside parties to acquire these firms and operate them in 
profit-maximizing ways. 

More recently, Jensen argues that takeovers mitigate manager- 
stockholder conflicts that are especially severe in firms that generate 
substantial free cash flow (i.e., cash flow in excess of what is necessary 
to finance positive return investment projects). 2 Jensen asserts that 
managers in such firms often use free cash flow to finance un¬ 
profitable ventures, such as value-reducing acquisitions, rather than 
pay it out to stockholders in either dividends or stock buy-backs. 
Therefore, according to Jensen, takeovers are not only a “problem” 
but also a “solution.” He argues that many takeovers are designed, at 
least in part, either to undo previous unprofitable acquisitions by 
target firms or to prevent these firms from making future unprofit¬ 
able acquisitions. 

Although numerous studies of corporate takeovers have docu¬ 
mented that stockholders of target firms benefit from these transac¬ 
tions, 3 little evidence exists on the extent to which these gains reflect 
the reduction of agency problems of the type discussed by Marris, 
Manne, and Jensen. In a 1983 review article, Jensen and Ruback 


1 See, among others, Marris (1963), Manne (1965), Alchian and Demsetz (1972), 
Jensen and Meckling (1976), Fama (1980), Demsetz (1983), Fama and Jensen (1983a, 
1983b), the June 1983 issue of the Journal of Law and Economics, Demsetz and Lehn 
(1985), Murphy (1985), and Jensen (1986). 

2 Lehn and Poulsen (1989), Mahle (1989), Maloney, McCormick, and Mitchell (1989), 
and Lang, Stulz, and Walkling (in press) provide empirical evidence consistent with 
Jensen’s hypothesis. 

3 For a review of the literature on the effects of takeovers on stock prices, see Jensen 
and Ruback (1983) and Jarrell, Brickley, and Netter (1988). Several studies also show 
that legislation that restricts takeovers adversely affects stock prices (see Ryngaert and 
Netter 1988; Schumann 1988; Mitchell and Netter, in press). 




conclude that “knowledge of the source of takeover gains still eludes 
us” (p. 47); a more recent survey article by Jarrell et al. (1988) con¬ 
curs. To partially fill this gap of knowledge, this study addresses a 
testable question that emerges from Marris, Manne, and Jensen: Are 
takeover targets distinguished from other firms by the profitability of 
their prior acquisitions? Specifically, do target firms, relative to other 
firms, systematically make acquisitions that the stock market judges 
harshly? 

Anecdotal evidence suggests that the raison d’être of some takeovers 
is the poor acquisition record of target firms. For example, one stated 
motive of Sir James Goldsmith’s unsuccessful hostile takeover attempt 
of Goodyear Tire and Rubber Company in October 1986 was his 
desire to sell Goodyear’s petroleum and aerospace divisions and con¬ 
centrate Goodyear's attention on its tire and rubber operations. 
Goldsmith offered a premium of approximately $1.13 billion 
(roughly 30 percent of the preoffer equity value of Goodyear). 

Although Goodyear, originally a tire and rubber company, had 
diversified into the aerospace business earlier, its 1983 purchase of 
Celeron Oil for approximately $800 million was its first major petroleum 
acquisition. On the day of the acquisition announcement, February 
8, 1983, Goodyear's stock price suffered an abnormal decline of 
10.04 percent, resulting in a loss of $249 million for Goodyear stockholders. 
Over a narrow event window surrounding the announcement 
(5 trading days before the announcement through 1 trading day 
after the announcement), Goodyear's stock price declined 14.83 percent, 
resulting in shareholder losses of $359 million. 4 The premium 
offered by Goldsmith may have recouped losses sustained by Goodyear 
shareholders 3 years earlier when Goodyear began its diversification 
into the oil industry. 5 Goodyear successfully defeated Goldsmith's 
takeover attempt, but its stock price did not fall to the 
preoffer level since it instituted a major restructuring program that 
was similar to the one that Goldsmith promised. Not surprisingly, the 


4 Over a longer window surrounding this announcement (5 trading days before the 
announcement through 40 trading days after the announcement), Goodyear's stock 
price declined 23.68 percent. 

5 The Standard & Poor's 500 index increased by approximately 61 percent from 5 
days before the first announcement of Goodyear's acquisition of Celeron Oil in February 
1983 through 20 days before the first announcement of Goldsmith's bid for Goodyear 
in October 1986. If the shareholder losses ($249 million [day of announcement], 
$359 million [-5, 1 window], and $573 million [-5, 40 window]) associated with 
Goodyear's energy acquisition had been invested in the S & P 500 during this period, 
they would have increased to $401 million, $578 million, and $923 million, respectively. 
Hence, the premium offered by Goldsmith would have restored much of the equity 
value in Goodyear that had been depreciated earlier when Goodyear's management 
made these acquisitions. 


restructuring program included the sale of a substantial part of Celeron 
Oil. 
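
The arithmetic in the footnote on the S & P 500 comparison can be checked with a short sketch. The loss figures and the 61 percent index gain come from the text; the variable and function names are illustrative only.

```python
# Shareholder losses (in $ millions) reported for the three event windows.
losses = {"day 0": 249, "[-5, 1]": 359, "[-5, 40]": 573}

# Approximate S&P 500 gain between the two announcements, as stated in the text.
SP500_GROWTH = 0.61


def grown_value(loss_millions, growth=SP500_GROWTH):
    """Value the lost equity would have reached if invested in the index."""
    return loss_millions * (1.0 + growth)


# Round to the nearest $ million, as the text does.
grown = {window: round(grown_value(loss)) for window, loss in losses.items()}
print(grown)
```

The results can then be set against the premium of approximately $1.13 billion that Goldsmith offered.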

To determine whether the Goodyear case generalizes to a large 
sample of takeovers, we examine the stock price reactions to acquisitions 
made by two sets of firms during 1982-86: firms that become 
targets of takeover attempts after their acquisitions (i.e., "targets") 
and a control group of firms that do not receive takeover bids during 
the sample period (i.e., "nontargets"). Within the sample of targets, 
we estimate the stock price effects associated with acquisitions by firms 
that later receive hostile bids (i.e., "hostile targets") and acquisitions by 
firms that later receive friendly bids (i.e., "friendly targets"). The 
following results are revealed. 

1. For the entire sample, the average stock price effect associated 
with acquisition announcements is not significantly different from 
zero: 0.14 percent measured over the period of 5 trading days before 
the announcements through 1 trading day after the announcements 
([-5, 1] window), and 0.70 percent measured over the period of 5 
trading days before the announcements through 40 trading days after 
the announcements ([-5, 40] window). 

2. Significant differences exist between the average stock price effect 
associated with acquisitions made by targets and the corresponding 
effect associated with acquisitions made by nontargets. The 
stock prices of targets decline significantly when they announce acquisitions 
(-1.27 percent over the [-5, 1] window and -3.38 percent 
over the [-5, 40] window), and the stock prices of nontargets increase 
significantly when they announce acquisitions (0.82 percent and 3.32 
percent, respectively). Within the sample of targets, this stock price 
effect is similar in magnitude for hostile targets (-1.34 percent and 
-3.37 percent, respectively) and friendly targets (-1.17 percent and 
-3.39 percent, respectively). 

3. For the entire sample of acquisitions, the average stock price 
effect associated with acquisitions that subsequently are divested is 
significantly lower (-1.53 percent and -4.01 percent, respectively) 
than the corresponding stock price effect associated with acquisitions 
that are not subsequently divested (0.56 percent and 1.89 percent, 
respectively). This difference is especially striking for the sample of 
acquisitions by target firms. The average stock price effect associated 
with acquisitions made by targets that subsequently are divested following 
the reception of their bids, either by their acquiring firms or by 
themselves, is -2.07 percent and -7.04 percent, respectively; the 
corresponding average stock price effect associated with other acquisitions 
made by targets is -0.72 percent and -0.87 percent, respectively. 

4. Estimates from a logit equation reveal that, with equity value and 
the percentage of equity held by management held constant, the 
probability that a firm is a target, especially a hostile target, during 
1982-88 is inversely and significantly related to the stock price effects 
associated with announcements of the firm's acquisitions: the more 
negative these effects, the higher the likelihood of a subsequent 
takeover attempt. 
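
As a sketch only, the logit relation in item 4 could be written as follows. The coefficient names $\gamma_0, \ldots, \gamma_3$ and the regressor labels $EQ_i$ (equity value) and $MGT_i$ (percentage of equity held by management) are placeholders introduced here; this part of the text does not state the exact functional form of the regressors (for example, whether equity value enters in logs).

```latex
\Pr(\text{target}_i = 1)
  = \frac{1}{1 + \exp\!\left[-\left(\gamma_0 + \gamma_1\, CAR_i
      + \gamma_2\, EQ_i + \gamma_3\, MGT_i\right)\right]},
\qquad \gamma_1 < 0 .
```

A negative $\gamma_1$ corresponds to the reported finding: the more negative the announcement effect $CAR_i$, the higher the fitted probability of a subsequent takeover attempt.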

These results suggest that one source of value in many corporate 
takeovers, especially hostile takeovers, is recoupment of target equity 
value that had been lost because of the targets' poor acquisition strategies 
prior to the reception of their bids. They also support the argument 
that hostile bust-up takeovers promote economic efficiency by 
reallocating the target's assets to higher-valued uses. Hence, these 
results support the theories by Marris, Manne, and Jensen concerning 
the disciplinary role of corporate takeovers. Additionally, the divestiture 
findings suggest that when companies announce acquisitions, the 
stock market provides an unbiased forecast of the likelihood that the 
assets will be ultimately divested. 


II. Description of Data 

The sample for this study consists of 1,158 public corporations in 51 
industries covered by Value Line during the fourth quarter of calendar 
year 1981. 6 The modified sample excludes two highly regulated 
industries (financial services and electric utilities) covered by Value 
Line and industries that contain fewer than 10 firms. The sample 
includes 64.4 percent of the companies in the 1981 S & P 500 index 
and 75.2 percent of the companies in the 1981 Fortune 500. 

Each of the 1,158 firms was classified into one of four groups on the 
basis of whether the firm was a takeover target during January 1980- 
July 1988: (1) nontargets, (2) hostile targets, (3) friendly targets, and 
(4) miscellaneous firms. Table 1 displays a frequency distribution for 
these four groups. The 600 nontarget firms (51.8 percent of the 
sample) did not receive friendly or hostile bids, pay greenmail, file for 
bankruptcy, significantly restructure, or become subject to large unsolicited 
open-market purchases. The hostile target group consists of 
228 firms (19.7 percent) that were targets of successful and unsuccessful 
hostile tender offers, proxy contests (in which the dissenting 
shareholder sought control), and large unsolicited open-market purchases 
in which the purchaser attempted to secure control. The 
friendly target group contains 240 firms (20.7 percent of the sample) 

6 Every quarter, Value Line examines the financial prospects of approximately 1,500 
firms in more than 65 industries. Each week during every quarter, it publishes a 
financial summary for a subset of the firms that it covers. 
TABLE 1 

Frequency of Different Types of Control Transactions, 1980-88 

                                                  Number  Percentage
Nontargets (N = 600): 
  No control transaction                             600        51.8
Hostile targets (N = 228): 
  Successful hostile tender offer                     58         5.0
  Unsuccessful hostile tender offer                   69         6.0
  Unsuccessful hostile tender offer, followed 
    by merger                                         51         4.4
  Unsuccessful hostile tender offer, followed 
    by leveraged buy-out                              21         1.8
  Unsuccessful hostile tender offer, ending 
    in greenmail                                       7          .6
  Proxy fight                                         21         1.8
  Unsolicited large open-market purchase               1          .1
Friendly targets (N = 240): 
  Merger or friendly tender offer                    163        14.1
  Leveraged buy-out                                   53         4.6
  Unsuccessful leveraged buy-out                      10          .9
  Unsuccessful leveraged buy-out, followed 
    by merger                                         14         1.2
Miscellaneous (N = 90): 
  Greenmail, without a tender offer                   11          .9
  Large targeted repurchase (possible greenmail)      10          .9
  Large open-market purchases                         16         1.4
  Bankruptcy filings or NYSE suspensions              26         2.2
  Significant corporate restructuring                 27         2.3
Total sample                                       1,158       100.0


that were targets of successful and unsuccessful friendly tender offers, 
mergers, and leveraged buy-outs. The miscellaneous category 
contains 90 firms (7.8 percent) that paid greenmail (without a tender 
offer), filed for bankruptcy, were subject to large open-market purchases 
in which the purchaser expressed no interest in securing control, 
made large targeted stock repurchases (without a takeover attempt), 
or significantly restructured (without a takeover attempt). 

The Dow Jones Broadtape was then examined for announcements 
of acquisitions by the 1,158 firms during 1982-86, including acquisitions 
of other public companies, acquisitions of private companies, 
and purchases of assets, divisions, subsidiaries, and stock of other 
companies. 7 We limit the sample to acquisitions in which the disclosed 

7 Both the New York Stock Exchange (NYSE) and American Stock Exchange 
(AMEX) require member firms to disclose to Dow Jones any information such as an 
acquisition that might be expected to significantly affect their stock prices. Dow Jones 
transmits the disclosed information across the Broadtape to subscribers across the 
country. Subsequent editions of the Wall Street Journal include most of the Broadtape 
stories. 
purchase price was at least 5 percent of the market value of the 
acquiring firm’s common equity 20 trading days prior to the first 
announcement of the acquisition. 

While we examine the stock price effects of acquisitions made during 
1982-86, we record control transactions for the firms in the sample 
from 1980 through July 1988. Extending control transactions 
beyond the last year in the acquisition sample allows time for a 
takeover attempt to occur in response to prior value-reducing acquisitions. 
In addition, some firms may not have received a takeover offer 
during 1982-88, but received an offer during 1980-81. Acquisitions 
by these firms during 1982-86 will have occurred after takeover offers 
for themselves. Since the objective of this study is to examine the 
extent to which the market for corporate control disciplines firms that 
make value-reducing acquisitions, it seems inappropriate to include 
these acquisitions in the target category. It also seems inappropriate, 
however, to include these acquisitions in the nontarget category, since 
these firms did receive takeover offers; hence, we opt for including these 
acquisitions in the miscellaneous category. 8 

Although the first three groups of firms are well defined, the 
fourth group is a residual group lying between target firms and nontarget 
firms. We examine the stock price reactions to acquisition announcements 
by all four groups, although our principal interest lies 
in differences in the stock price effects of acquisitions made by the 
first three groups. 

Table 2 lists the number of firms that made acquisitions and the 
number of acquisitions that they made for the four groups, the entire 
sample, and all target firms. The data in this table reveal that most 
firms in the sample did not make acquisitions of an amount that was at 
least 5 percent of their equity value during 1982-86. During this 
period, 280 firms (24 percent of the sample) made 401 acquisitions. 
Included in this sample of acquisitions are (1) 232 acquisitions by 166 
nontargets (28 percent of all nontargets), (2) 113 acquisitions by 77 
targets (16 percent of all targets), (3) 70 acquisitions by 48 hostile 
targets (21 percent of all hostile targets), (4) 43 acquisitions by 29 
friendly targets (12 percent of all friendly targets), and (5) 56 acquisitions 
by 38 miscellaneous firms (42 percent of all miscellaneous firms). 

Although these data might appear to indicate that nontarget firms 
make acquisitions more often than targets, it is inappropriate to compare 

8 For example, Houston Natural Gas made two acquisitions in November 1984, after 
successfully defeating a hostile tender offer by Coastal Corp. earlier in the year. In 
1985, Houston Natural Gas merged with InterNorth. Although the acquisitions by 
Houston Natural Gas could be classified as acquisitions by a friendly target, we chose to 
classify them as miscellaneous acquisitions since they followed an unsuccessful hostile 
bid. The empirical results are invariant with respect to this decision. 



TABLE 2 

Summary Statistics for Acquisitions, 1982-86 



that made up the various forms of financing for each category. 
these frequencies directly since the sample period is effectively 
longer for nontargets than for targets. For target firms, we record 
only acquisitions made from January 1982 through 3 months prior to 
the first announcement of their suitor’s interest in acquiring control. 
For example, if the first announcement of a bid for a target firm 
occurred in June 1984, we record acquisitions for only two years 
(1982 and 1983) and part of one year (January 1984-March 1984). 
One conclusion, however, does emerge directly from table 2. Since 79 
percent of the hostile targets did not make a large acquisition during 
the period preceding the reception of their bids, at best, the bad- 
bidder explanation of hostile takeovers can explain only part of the 
reason for these transactions. 
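
The recording rule for target firms described above can be sketched as follows. This is a minimal illustration that assumes the cutoff is applied at month granularity, which matches the June 1984 example in the text; the function and variable names are hypothetical.

```python
from datetime import date

# Start of the acquisition sample period given in the text.
SAMPLE_START = date(1982, 1, 1)


def months_before(d, months):
    """First day of the month lying `months` months before d's month."""
    index = d.year * 12 + (d.month - 1) - months
    return date(index // 12, index % 12 + 1, 1)


def in_recording_window(acq_date, first_bid_date, lag_months=3):
    """True if an acquisition falls between January 1982 and the month that
    is `lag_months` before the first announcement of the bid (inclusive)."""
    cutoff_month = months_before(first_bid_date, lag_months)
    acq_month = date(acq_date.year, acq_date.month, 1)
    return SAMPLE_START <= acq_date and acq_month <= cutoff_month


# Example from the text: a first bid announcement in June 1984 means that
# acquisitions are recorded through March 1984.
bid = date(1984, 6, 15)
print(in_recording_window(date(1984, 3, 31), bid))
print(in_recording_window(date(1984, 4, 1), bid))
```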

Table 2 also displays the mean and median ratio of the purchase 
price of the acquisitions to the equity value of the acquiring firms for 
the six groups of firms. For the entire sample, the mean value of this 
ratio is 0.37. The mean value of the acquisition relative size variable 
for the subgroups ranges from 0.30 for the hostile target category to 
0.44 for the miscellaneous category. As with the full sample, the mean 
ratio exceeds the median ratio for every subgroup. Tests for differences 
in means indicate that the mean ratios for the various groups 
are not significantly different from one another, so the empirical 
results reported in the next section are comparable across groups. 

Several modes of payments can be used in making acquisitions. For 
each acquisition, we collected the form of payment from Mergers and 
Acquisitions and the Wall Street Journal. The data indicate that pure 
cash offers are the predominant form of payment, accounting for 254 
(63 percent) of the acquisitions. At least some cash is used in 354 (88 
percent) of the acquisitions. 9 In contrast, pure stock transactions account 
for only 45 (11 percent) of the acquisitions, and at least some 
stock is used in 103 (26 percent) of the acquisitions. 

Asquith, Bruner, and Mullins (1987) and Travlos (1987) find that 
the form of payment is correlated with the market's reaction to an acquisition 
announcement. For a sample of 343 mergers and tender offers 
that occurred during 1973-83, Asquith et al. observe positive stock 
price reactions to acquiring firms for cash offers and negative reactions 
for stock offers. Travlos reports insignificant stock price reactions 
to acquiring firms for cash offers and negative reactions for 
stock offers for a sample of 167 mergers and tender offers during 


9 The "other" category includes 12 acquisitions partly financed by cash. This category 
is composed of cash, stock, and notes (8); cash, notes, and assumption of target debt (1); 
notes (1); cash and assets (1); cash and debentures (2); and stock and assumption of target 
debt (1). 
1972-81. 10 Among nontargets, hostile targets, and friendly targets, 
for which comparison will be made in the following sections, the 
frequency distributions of form of payment do not differ significantly. 
The form of payment for the miscellaneous category does 
differ somewhat from the other categories since the miscellaneous 
category contains relatively fewer pure cash acquisitions. Since the 
primary focus of this study is to distinguish between acquisitions by 
target firms and acquisitions by nontarget firms, the form of payment 
should not bias the comparisons reported in the remainder of the 
paper. 


III. Stock Market Analysis of Acquisitions 

A. Event-Study Methodology 

We employ event-study methodology to measure the stock price effects 
associated with announcements of acquisitions. Using the Center 
for Research in Security Prices (CRSP) daily returns tapes, we estimate 
the abnormal return ($ar_{it}$) for each acquiring firm during the 
period 20 days preceding the event date through 40 days following 
the event date. Abnormal returns are computed as 

$$ar_{it} = r_{it} - \hat{\alpha}_i - \hat{\beta}_i r_{mt},$$ 

where $r_{it}$ is the return to firm $i$ at time $t$, $r_{mt}$ is the return to the CRSP 
value-weighted index of NYSE and AMEX stocks, and $\hat{\alpha}_i$ and $\hat{\beta}_i$ are 
market model parameter estimates from the period 170 through 21 
trading days preceding the event date. 
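
A minimal sketch of the market model step, assuming ordinary least squares on the estimation-period returns (150 daily observations under the 170-through-21-day window in the text). The function names and toy numbers are illustrative and are not taken from the paper.

```python
def fit_market_model(firm_returns, market_returns):
    """OLS estimates (alpha, beta) of r_it = alpha + beta * r_mt + e_it,
    fitted on estimation-period returns."""
    n = len(firm_returns)
    mean_r = sum(firm_returns) / n
    mean_m = sum(market_returns) / n
    sxx = sum((m - mean_m) ** 2 for m in market_returns)
    sxy = sum((m - mean_m) * (r - mean_r)
              for m, r in zip(market_returns, firm_returns))
    beta = sxy / sxx
    alpha = mean_r - beta * mean_m
    return alpha, beta


def abnormal_returns(firm_returns, market_returns, alpha, beta):
    """ar_it = r_it - alpha - beta * r_mt for each event-period day."""
    return [r - alpha - beta * m
            for r, m in zip(firm_returns, market_returns)]


# Hypothetical toy data: the firm tracks the market exactly in the
# estimation period, then earns 2 percent on a day the market earns 1 percent.
est_market = [0.01, -0.02, 0.03, 0.00]
est_firm = [0.001 + 1.5 * m for m in est_market]
alpha_hat, beta_hat = fit_market_model(est_firm, est_market)
event_ar = abnormal_returns([0.02], [0.01], alpha_hat, beta_hat)
print(alpha_hat, beta_hat, event_ar)
```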

The event date for each acquisition is the first date on which the 
Dow Jones Broadtape reports a story about the acquisition. These 
initial stories range from reports that the acquiring firm is rumored to 
be interested in making the acquisition, often with no price disclosed, 
to reports that both the bidder and the target definitively agreed to 
the acquisition. We then average the daily abnormal returns across 
firms in each group to obtain the portfolio abnormal return, 
$AR_t = \sum_{i=1}^{N} ar_{it}/N$, where $N$ is the number of firms in each portfolio of interest, 
and cumulate over various windows to obtain the cumulative abnormal 
return, $CAR = \sum_{t=1}^{T} AR_t$, where $T$ is the length of the event 


10 We find a higher proportion of cash offers in our study than Asquith et al. and 
Travlos do. They focus on mergers and tender offers, whereas our study examines all 
acquisitions, including purchases of assets and divisions, which are generally cash offers. 
window. In the absence of abnormal performance, the expected value 
of the AR and CAR equals zero. 11 

In the tables that follow, we report the AR for the acquisition announcement 
date [0] and the CAR for the corresponding four windows: 
(1) 1 day before the event date through 1 day after the event 
date, [-1, 1], (2) 5 days before the event date through 1 day after the 
event date, [-5, 1], (3) 5 days before the event date through 40 days 
after the event date, [-5, 40], and (4) 20 days before the event date 
through 40 days after the event date, [-20, 40]. 
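
The averaging and cumulation steps over these windows can be sketched as below. The data layout (one dictionary per firm, keyed by event day) and the toy numbers are assumptions made for illustration.

```python
# Event windows reported in the text, as (first_day, last_day) relative to day 0.
WINDOWS = [(-1, 1), (-5, 1), (-5, 40), (-20, 40)]


def portfolio_ar(ar_by_firm, day):
    """Cross-sectional mean of abnormal returns on one event day (AR_t)."""
    return sum(firm[day] for firm in ar_by_firm) / len(ar_by_firm)


def cumulative_ar(ar_by_firm, first_day, last_day):
    """Sum of AR_t over the inclusive window [first_day, last_day] (CAR)."""
    return sum(portfolio_ar(ar_by_firm, d)
               for d in range(first_day, last_day + 1))


# Hypothetical illustration: two firms with constant daily abnormal returns.
days = range(-20, 41)
toy = [{d: 0.01 for d in days}, {d: -0.03 for d in days}]
cars = {window: cumulative_ar(toy, *window) for window in WINDOWS}
print(cars)
```

Note that the windows are inclusive at both ends, so [-5, 1] covers 7 trading days and [-20, 40] covers 61.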


B. Stock Price Performance of Acquiring Firms 

Table 3 displays the announcement day AR and corresponding CARs 
(z-statistics are in parentheses, with the percentage positive listed below 
the z-statistics) associated with the announcements of acquisitions 
made by each of six groups of firms. In addition to examining the 
stock price effects of the four groups discussed earlier (hostile targets, 
friendly targets, nontargets, and miscellaneous), we also report the 
stock price effects associated with acquisitions by all targets and by the 
entire sample. 

The announcement day AR for the entire sample of 401 acquisitions 
is -0.21 percent and is 
statistically significant at the .05 level. The CARs corresponding to the 
other four windows range from -0.08 percent ([-1, 1]) to 0.70 percent 
([-5, 40]), and none of these is significantly different from zero. 
These results suggest that, on average, acquiring firms earn a normal 
rate of return on their investments, a finding consistent with a competitive 
market for corporate control. 12 


11 We construct standardized test statistics to assess the statistical significance of stock 
market abnormal performance. We divide each abnormal return by the square root of 
its forecast variance: 

$$\sigma_{ar_{it}} = \left\{ \sigma_i^2 \left[ 1 + \frac{1}{L} + \frac{(r_{mt} - R_m)^2}{CSSR_m} \right] \right\}^{1/2}$$ 

(where $\sigma_i^2$ is the estimated residual variance for the estimation period, $L$ is the number 
of observations in the estimation period, $R_m$ is the estimation period mean of the 
market return, and $CSSR_m$ is the corrected sum of squares of the market return during 
the event window), to form a standardized abnormal return, $sar_{it} = ar_{it}/\sigma_{ar_{it}}$. The test 
statistic for the AR is $Z_t = (1/\sqrt{N}) \sum_{i=1}^{N} sar_{it}$, and the test statistic for the CAR is 
$(1/\sqrt{T}) \sum_{t=1}^{T} Z_t$, where $T$ is the length of the event window. We also conduct nonparametric tests 
to test the robustness of the results reported. These tests include a test for the percentage 
of the abnormal returns that are positive and the Wilcoxon signed rank test. The 
statistical significance of the results reported throughout the text is robust with respect 
to these nonparametric tests. As with all other results mentioned but not reported in 
the text, they are available on request. 
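
The standardization described in this footnote can be sketched as follows. The formulas follow the definitions given there; the function names, argument names, and toy inputs are illustrative assumptions.

```python
import math


def forecast_sd(resid_var, n_est, r_mt, mean_mkt, cssr_mkt):
    """Square root of the forecast variance of one abnormal return:
    sqrt(resid_var * (1 + 1/L + (r_mt - R_m)^2 / CSSR_m))."""
    return math.sqrt(
        resid_var * (1.0 + 1.0 / n_est + (r_mt - mean_mkt) ** 2 / cssr_mkt)
    )


def z_for_day(sar_values):
    """Z_t: sum of standardized abnormal returns across N firms over sqrt(N)."""
    return sum(sar_values) / math.sqrt(len(sar_values))


def z_for_window(daily_z):
    """Test statistic for the CAR: sum of Z_t over T days divided by sqrt(T)."""
    return sum(daily_z) / math.sqrt(len(daily_z))


# Hypothetical inputs: residual variance 0.0004 from 150 estimation days, on a
# day when the market return equals its estimation-period mean.
sd = forecast_sd(0.0004, 150, 0.01, 0.01, 0.05)
sar = 0.004 / sd  # standardized abnormal return for ar_it = 0.4 percent
print(sd, sar, z_for_day([1.0, 1.0, 1.0, 1.0]), z_for_window([2.0] * 4))
```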

12 See Bradley, Desai, and Kim (1988) and Jarrell and Poulsen (1989) for studies of 
returns to acquiring firms and target firms in tender offers. 
TABLE 3 

Abnormal Stock Market Performance Associated with Firms Announcing 
Acquisitions during 1982-86 

                                        Event Window
Category              [0]         [-1, 1]     [-5, 1]     [-5, 40]    [-20, 40]
Entire sample         -.21**      -.08        .14         .70         .57
(N = 401)             (-2.18)     (-.45)      (.53)       (1.05)      (.75)
                      42.39       46.63       50.37       54.11       53.12
Nontargets            .09         .49**       .82**       3.32***     3.48***
(N = 232)             (.66)       (2.19)      (2.42)      (3.80)      (3.46)
                      44.83       48.71       56.03       62.93       62.07
All targets           -.78***     -.93***     -1.27***    -3.38***    -3.46***
(N = 113)             (-4.59)     (-3.16)     (-2.82)     (-2.93)     (-2.60)
                      39.82       39.82       38.05       38.05       38.94
Hostile targets       -.95***     -1.50***    -1.34**     -3.37**     -3.19**
(N = 70)              (-4.64)     (-4.22)     (-2.46)     (-2.42)     (-2.00)
                      38.57       37.14       35.71       38.57       40.00
Friendly targets      -.50*       -.01        -1.17       -3.39*      -3.91*
(N = 43)              (-1.68)     (-.02)      (-1.47)     (-1.67)     (-1.67)
                      41.86       44.19       41.86       37.21       37.21
Miscellaneous         -.31        -.69        .14         -1.93       -3.33
(N = 56)              (-1.01)     (-1.32)     (.17)       (-.94)      (-1.42)
                      37.50       51.79       51.79       50.00       44.64

NOTE: z-statistics are in parentheses, and the percentages of abnormal returns that 
are positive are listed below the z-statistics. 
* Significant at the 10 percent level. 
** Significant at the 5 percent level. 
*** Significant at the 1 percent level. 


The results listed in table 3, however, reveal that the stock price 
effects associated with announcements of acquisitions made by target 
firms differ significantly from the stock price effects associated with 
announcements of acquisitions made by nontarget firms. The announcement 
day AR associated with 113 acquisitions made by all 
target firms is -0.78 percent. The CAR ranges from -3.46 percent 
([-20, 40]) to -0.93 percent ([-1, 1]). All these estimates are statistically 
significant at the .01 level. Furthermore, the CAR becomes significantly 
more negative when the event window extends beyond the 
acquisition announcements. These data indicate that the market 
reacted negatively to the initial announcements of these acquisitions 
and suggest that as the market learned more about these acquisitions 
during the succeeding weeks (e.g., purchase price, definitiveness of 
the acquisition, and resulting synergy), the market further devalued 
the acquiring firms. 

Within the group of target firms, the results are especially significant 
for hostile targets. The announcement day AR associated with 
70 acquisitions made by hostile targets is -0.95 percent, and the CAR 
ranges from -3.37 percent ([-5, 40]) to -1.34 percent ([-5, 1]). All 
the estimates are significant at the .05 level or higher. These results 
compare with the corresponding results for acquisitions by friendly 
targets. The AR on the announcement day for 43 acquisitions made 
by friendly targets is -0.50 percent and is significant at the .10 level. 
The CAR ranges from -3.91 percent ([-20, 40]) to -0.01 percent 
([-1, 1]) and is significant at the .10 level for the two longest windows. 
The pattern of returns is similar for both hostile targets and friendly 
targets; both sets of returns become considerably more negative when 
the event window extends beyond day 1. 

The abnormal stock price performance associated with 232 acquisitions 
made by nontarget firms contrasts sharply with the results for 
target firms. The announcement day AR for nontarget firms is 0.09 
percent, and the CAR ranges from 0.49 percent ([-1, 1]) to 3.48 
percent ([-20, 40]). With the exception of the announcement day 
AR, all these estimates are significant at the .05 level or higher. The 
CAR for nontargets increases and remains statistically significant 
when the event window extends beyond the day after the announcement 
of the acquisitions, a result that contrasts sharply with the corresponding 
result for target firms. 

Finally, the results reveal that the 56 acquisitions made by the 
group of miscellaneous firms had no statistically significant effect on 
their stock prices, regardless of the window used to measure the effect. 
The announcement day AR for these firms is -0.31 percent, 
and the CAR for the four other windows ranges from -3.33 percent 
([-20, 40]) to 0.14 percent ([-5, 1]). 

The empirical results from table 3 indicate that the stock market 
negatively values acquisitions by firms that become takeover targets, 
especially hostile targets, whereas it positively values acquisitions by 
firms that never did become takeover targets during the sample period. 
Figure 1 graphically depicts the difference in the serial pattern 
of CARs for hostile targets, friendly targets, nontargets, miscellaneous 
firms, and the entire sample for the [-5, 40] window. Though 
the difference in abnormal stock price performance between nontargets 
and the other groups is obvious, no difference appears to exist 
among the hostile target, friendly target, and miscellaneous groups. 
Recall, however, that the estimates are statistically significant in all 
event windows for the hostile target category, but not significant in 
two of the windows for the friendly target category and any of the 
windows for the miscellaneous category. 

The results in table 3 imply that the difference in abnormal stock 
returns associated with acquisitions by targets and nontargets also is 
significant. Table 4 lists the differences in the announcement day AR 
TABLE 4 

Differences in Abnormal Returns Associated with Announcements of 
Acquisitions, 1982-86 

Comparison of                           Event Window
Abnormal Returns      [0]         [-1, 1]     [-5, 1]     [-5, 40]    [-20, 40]
All targets           .87***      1.42***     2.09***     6.70***     6.94***
vs. nontargets        (3.99)      (3.84)      (3.71)      (4.63)      (4.16)
Hostile targets       1.04***     1.99***     2.16***     6.69***     6.67***
vs. nontargets        (4.23)      (4.74)      (3.37)      (4.07)      (3.54)
Friendly targets      .59*        .50         1.99**      6.71***     7.39***
vs. nontargets        (1.80)      (.91)       (2.30)      (3.04)      (2.90)

NOTE: z-statistics are in parentheses. 
* Significant at the 10 percent level. 
** Significant at the 5 percent level. 
*** Significant at the 1 percent level. 



and corresponding CARs associated with acquisitions made by (a) all 
targets and nontargets, (b) hostile targets and nontargets, and (c) 
friendly targets and nontargets. 13 The difference in abnormal stock 
price performance is statistically significant at the .01 level for the 
comparisons involving all targets and nontargets, and hostile targets 
and nontargets, for all five event windows. The difference is 
significant for friendly targets and nontargets at the .10 level or 
higher for all but the [-1, 1] window. 
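
As a consistency check, the point estimates in table 4 should equal the nontarget estimate minus the corresponding target estimate from table 3. The sketch below uses the values printed in the two tables; the dictionary layout and names are illustrative.

```python
# Columns: [0], [-1, 1], [-5, 1], [-5, 40], [-20, 40] (percent).
table3 = {
    "nontargets":       [0.09, 0.49, 0.82, 3.32, 3.48],
    "all targets":      [-0.78, -0.93, -1.27, -3.38, -3.46],
    "hostile targets":  [-0.95, -1.50, -1.34, -3.37, -3.19],
    "friendly targets": [-0.50, -0.01, -1.17, -3.39, -3.91],
}
table4 = {
    "all targets":      [0.87, 1.42, 2.09, 6.70, 6.94],
    "hostile targets":  [1.04, 1.99, 2.16, 6.69, 6.67],
    "friendly targets": [0.59, 0.50, 1.99, 6.71, 7.39],
}


def differences(group):
    """Nontarget minus target abnormal return for each window, to 2 decimals."""
    return [round(n - t, 2)
            for n, t in zip(table3["nontargets"], table3[group])]


for group, reported in table4.items():
    print(group, differences(group) == reported)
```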

C. Stock Price Performance of Acquiring Firms for 
Subsequently Divested Acquisitions 

The results presented so far show that, on average, targets, especially 
hostile targets, make acquisitions that diminish their stock prices, 
whereas nontarget firms make acquisitions that increase their stock 
prices. Two plausible explanations exist for the negative abnormal 
stock price performance associated with acquisitions made by targets: 
either targets systematically acquire assets that the market believes will 
reduce the combined operating profits of the acquired assets and 
themselves, or targets systematically overpay for acquisitions that the 
market believes will increase the combined operating profits of the 
acquired assets and themselves. Both explanations are consistent with 
the theory that takeovers often discipline managers who do not maximize 

13 Note that we do not compare targets or nontargets with the miscellaneous category. 
Our aim is to compare the differences between targets and nontargets. In Sec. II, 
we observed that the firms in the miscellaneous category do not belong in either the 
target or nontarget categories. Furthermore, as shown in table 3, the stock price effects 
for the miscellaneous category are not statistically different from zero for any of the 
event windows. 

stockholders' wealth. The former explanation suggests that the 
motive behind many takeovers is to undo inefficient acquisitions previously 
made by the targets. The latter explanation suggests that 
takeovers can serve to restrain managers in target firms from persistently 
overpaying for acquisitions in the future. 14 

Although it is difficult to measure the extent to which the latter 
explanation prevails, the relative importance of the former explanation 
can be examined by comparing the rate at which acquisitions 
made by targets are subsequently divested with the corresponding 
rate for nontargets during the sample period. 15 The target divestiture 
sample consists of acquisitions made by targets that are divested during 
a period ranging from 3 months prior to the reception of their 
bids through the end of the sample period. The target divestiture 
sample includes divestitures by targets to defend against takeovers, 
divestitures as part of restructuring programs after defeating 
takeover attempts, and divestitures by acquiring firms following successful 
takeovers of the targets. The nontarget divestiture sample 
consists of acquisitions that are divested by the end of the sample 
period. 

If the former explanation is important, then the divestiture rate 
should be higher for targets than for nontargets. In addition, if the 
former explanation holds, then the abnormal returns associated with 
acquisitions that subsequently are divested should be significantly 
lower than the abnormal returns associated with acquisitions that are 
not subsequently divested. This relationship should hold not only for 
the group of targets but also for the entire sample and each of the 
subsamples. 

Data on subsequent divestitures of acquisitions in the sample come 
from four sources: annual issues of Mergers and Acquisitions, the Wall 
Street Journal Index, and Standard & Poor's Directory of Corporate 
Affiliations during 1982-88 and telephone conversations with representatives 
of the acquiring companies themselves. 16 The data reveal 
that 81 of the 401 acquisitions during 1982-86, or 20.2 percent of the 
sample, were subsequently divested during 1982-88. 

14 According to Roll (1986), bidder overpayment results from hubris on the part of 
managers of acquiring firms (see also Black 1989). 

15 Porter (1987) and Ravenscraft and Scherer (1987) show that acquisitions made by 
firms in conglomerate mergers during the 1960s and 1970s were subsequently divested 
at a high rate. Neither study, however, examines divestiture rates for target and nontarget 
firms. 

16 Although we are confident that these four sources allow us to identify most
divestitures, we are not certain that we have identified them all. Some divestitures
may not have been reported in Mergers and Acquisitions, the Wall Street Journal Index,
or Standard & Poor's Directory of Corporate Affiliations, and in some cases, we were
unable to receive definitive information from the companies themselves. We have no
reason to believe that the unidentified divestitures bias the results above.

A significant difference in the divestiture rate exists between
nontargets and targets. Whereas only 9.1 percent (21/232) of the
acquisitions made by nontargets are subsequently divested, 40.7 percent
(46/113) of the acquisitions made by targets are subsequently divested,
either in response to or following successful or unsuccessful takeover
attempts (the z-statistic for the difference in divestiture rates is 6.34).
No significant difference in divestiture rates exists between hostile
targets and friendly targets; this rate is 41.9 percent (18/43) for
acquisitions made by friendly targets and 40 percent (28/70) for
acquisitions made by hostile targets (the z-statistic for the difference
in divestiture rates is 0.20). 17
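
The z-statistics quoted above can be checked from the reported counts.
The text does not say which standard error was used; the sketch below
assumes the unpooled (separate-variance) standard error for a difference
in two proportions, which appears to reproduce the reported 6.34 and 0.20
to two decimals. The function name is illustrative only.

```python
import math


def diff_in_proportions_z(x1, n1, x2, n2):
    """z-statistic for p1 - p2 using the unpooled standard error (an assumption)."""
    p1, p2 = x1 / n1, x2 / n2
    standard_error = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / standard_error


# Targets (46 of 113 acquisitions divested) versus nontargets (21 of 232).
z_target_vs_nontarget = diff_in_proportions_z(46, 113, 21, 232)

# Friendly targets (18 of 43) versus hostile targets (28 of 70).
z_friendly_vs_hostile = diff_in_proportions_z(18, 43, 28, 70)

print(round(z_target_vs_nontarget, 2))
print(round(z_friendly_vs_hostile, 2))
```

A pooled standard error would give a somewhat larger statistic for the
target versus nontarget comparison, so the choice of formula matters when
comparing against the reported values.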

It is noteworthy that only two of the friendly targets and none of 
the hostile targets divested previously acquired units prior to the 
threat of a takeover. Since we are interested in whether one motive 
for takeovers is to undo acquisitions made by target firms, these two 
acquisitions made by friendly targets are excluded from the friendly 
target divestiture sample. 

Table 5 displays abnormal returns for two sets of acquisitions for
each group of firms: acquisitions that subsequently were divested
during the sample period and those that were not. For the entire sample,
the announcement day AR associated with 81 acquisitions that
subsequently were divested is -1.26 percent and is significant at the .01
level. The CAR for the corresponding four windows ranges from
-5.59 percent ([-20, 40]) to -1.53 percent ([-5, 1]), and all these
estimates are significant at the .01 level. The announcement day AR
for the 320 acquisitions that were not subsequently divested is 0.05
percent and is not statistically significant. The corresponding CARs
for announcements of acquisitions that were not divested range from
0.35 percent ([-1, 1]) to 2.13 percent ([-20, 40]), and all these
estimates are significant at the .10 level or higher.
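
As a reminder of how the window statistics relate to daily abnormal
returns, the following minimal sketch assumes that a CAR is the simple
sum of daily ARs over the event window and that the third entry in each
cell of table 5 is the share of acquisitions with a positive return. The
estimation procedure itself is defined earlier in the paper and is not
reproduced here; all numbers below are hypothetical.

```python
def cumulative_abnormal_return(daily_ar, start, end):
    """Sum daily abnormal returns (in percent) over event days start..end inclusive."""
    return sum(ar for day, ar in daily_ar.items() if start <= day <= end)


def percent_positive(returns):
    """Percentage of observations with a strictly positive abnormal return."""
    return 100.0 * sum(1 for r in returns if r > 0) / len(returns)


# Hypothetical daily abnormal returns for one acquisition, keyed by event day
# (day 0 is the announcement day).
hypothetical_ar = {-2: 0.10, -1: -0.30, 0: -1.20, 1: -0.25, 2: 0.05}

car_short_window = cumulative_abnormal_return(hypothetical_ar, -1, 1)

# Hypothetical window CARs for four acquisitions in one category.
share_positive = percent_positive([car_short_window, 0.40, 0.20, -0.10])

print(car_short_window, share_positive)
```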

The findings of negative and significant abnormal returns associ¬ 
ated with acquisitions that subsequently are divested during the sam¬ 
ple period and positive and significant abnormal returns associated 
with acquisitions that are not subsequently divested during the sample 
period deliver a strong message of market efficiency. On average, the 
market is able to immediately provide an unbiased forecast of the 
likelihood that the assets will ultimately be divested, long before any 
cash flows from the resulting business combination are known. Table 
6 displays the difference in abnormal returns for these two groups; 
they range from 1.31 percent for the announcement day to 7.72 


17 Both hostile and friendly targets exhibit significantly higher divestiture rates than
nontargets. The z-statistic is 5.03 for the hostile target and nontarget comparison, and
4.23 for the friendly target and nontarget comparison.



TABLE 5

Abnormal Stock Market Performance Associated with Firms Announcing
Acquisitions during 1982-86 That Are Subsequently Divested versus
Acquisitions during 1982-86 That Are Not Subsequently Divested

                                          Event Window
Category                [0]        [-1, 1]    [-5, 1]    [-5, 40]   [-20, 40]

A. Acquisitions That Are Subsequently Divested

Entire sample           -1.26***   -1.75***   -1.53***   -4.01***   -5.59***
(N = 81)                (-6.15)    (-4.93)    (-2.81)    (-2.88)    (-3.48)
                        35.80      30.86      38.27      35.80      32.10

Nontargets              -1.16***   -1.66**    -.57       2.55       2.48
(N = 21)                (-2.86)    (-2.30)    (-.53)     (.92)      (.78)
                        42.86      28.57      47.62      61.90      57.14

All targets             -1.45***   -1.56***   -2.07***   -7.04***   -8.91***
(N = 46)                (-5.58)    (-3.46)    (-3.01)    (-3.99)    (-4.38)
                        30.44      30.44      34.78      23.91      21.74

Hostile targets         -2.01***   -2.59***   -1.84**    -4.96***   -6.35***
(N = 28)                (-7.13)    (-5.30)    (-2.46)    (-2.59)    (-2.88)
                        28.57      21.43      32.14      28.57      28.57

Friendly targets        -.58       .04        -2.44*     -10.27***  -12.90***
(N = 18)                (-1.19)    (.05)      (-1.89)    (-3.09)    (-3.37)
                        33.33      44.44      38.89      16.67      11.11

Miscellaneous           -.75       -2.38**    -.45       -3.21      -5.91
(N = 12)                (-1.16)    (-2.11)    (-.26)     (-.73)     (-1.16)
                        50.00      41.67      41.67      33.33      25.00

B. Acquisitions That Are Not Subsequently Divested

Entire sample           .05        .35*       .56**      1.89***    2.13**
(N = 320)               (.50)      (1.90)     (1.99)     (2.63)     (2.57)
                        44.06      50.63      53.44      58.75      58.44

Nontargets              .21        .70***     .96***     3.40***    3.58***
(N = 211)               (1.59)     (3.07)     (2.78)     (3.80)     (3.47)
                        45.02      50.71      56.87      63.03      62.56

All targets             -.32       -.50       -.72       -.87       .28
(N = 67)                (-1.47)    (-1.33)    (-1.26)    (-.59)     (.17)
                        46.27      46.27      40.30      47.78      50.75

Hostile targets         -.25       -.77*      -1.00      -2.31      -1.08
(N = 42)                (-.94)     (-1.70)    (-1.45)    (-1.31)    (-.53)
                        45.24      47.62      38.10      45.24      47.62

Friendly targets        -.45       -.05       -.25       1.56       2.56
(N = 25)                (-1.24)    (-.08)     (-.27)     (.64)      (.91)
                        48.00      44.00      44.00      52.00      56.00

Miscellaneous           -.18       -.23       .30        -1.58      -2.63
(N = 44)                (-.56)     (-.40)     (.34)      (-.71)     (-1.02)
                        34.09      54.55      54.55      54.55      50.00

Note: z-statistics are in parentheses, and the percentages of abnormal
returns that are positive are listed below the z-statistics.

* Significant at the 10 percent level.
** Significant at the 5 percent level.
*** Significant at the 1 percent level.


TABLE 6

Differences in Abnormal Returns Associated with Announcements of
Acquisitions, 1982-86

Comparison of                                   Event Window
Abnormal Returns              [0]        [-1, 1]    [-5, 1]    [-5, 40]   [-20, 40]

Divestitures vs.              1.31***    2.10***    2.09***    5.90***    7.72***
  nondivestitures             (5.75)     (5.25)     (3.41)     (3.77)     (4.27)

All target divestitures       1.66***    2.26***    3.03***    10.44***   12.49***
  vs. nontarget               (5.69)     (4.47)     (3.94)     (5.28)     (5.48)
  nondivestitures

Hostile target divestitures   2.22***    3.29***    2.80***    8.36***    9.93***
  vs. nontarget               (7.13)     (6.10)     (3.40)     (3.96)     (4.08)
  nondivestitures

Friendly target divestitures  .79        .66        3.40**     13.67***   16.48***
  vs. nontarget               (1.56)     (.79)      (2.54)     (3.97)     (4.16)
  nondivestitures

Note: z-statistics are in parentheses.
** Significant at the 5 percent level.
*** Significant at the 1 percent level.


percent for the [-20, 40] window, and they are all statistically signi¬ 
ficant at the .01 level. 
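
The first row of table 6 can be recovered directly from the entire-sample
rows of table 5, as the abnormal return for acquisitions that are not
divested minus the abnormal return for acquisitions that are divested.
The short check below uses only the published point estimates; it does
not attempt to reproduce the z-statistics, which require the underlying
firm-level data.

```python
windows = ["[0]", "[-1, 1]", "[-5, 1]", "[-5, 40]", "[-20, 40]"]

# Entire-sample point estimates (percent) from table 5.
divested = [-1.26, -1.75, -1.53, -4.01, -5.59]       # panel A
not_divested = [0.05, 0.35, 0.56, 1.89, 2.13]        # panel B

# Difference reported in table 6: nondivested minus divested.
differences = [round(kept - sold, 2) for kept, sold in zip(not_divested, divested)]

for window, difference in zip(windows, differences):
    print(f"{window:>9}: {difference:.2f}")
```

The same subtraction applied to the all-targets row of panel A and the
nontargets row of panel B gives the second row of table 6.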

The results from the sample of 46 target firm acquisitions that are
subsequently divested support the argument that the motive behind
many takeovers is to undo inefficient acquisitions previously made by
targets. The announcement day AR associated with these acquisitions
is -1.45 percent and is significant at the .01 level. The corresponding
CARs range from -8.91 percent ([-20, 40]) to -1.56 percent ([-1, 1]),
and all these estimates are significant at the .01 level. In contrast,
the announcement day AR associated with 67 acquisitions made by
targets that are not divested is -0.32 percent and is not statistically
significant. The corresponding CARs range from -0.87 percent
([-5, 40]) to 0.28 percent ([-20, 40]), and none of these estimates is
statistically significant.

Significant differences in CARs associated with acquisitions that are 
and are not subsequently divested exist for both friendly and hostile 
targets, although the relatively small size of the subsamples within 
each of these groups should be noted. In short, the data reveal that 
the average negative stock price effect associated with acquisitions 
made by targets is driven almost exclusively by the subset of acquisi¬ 
tions that subsequently are divested either in bust-up takeovers or 
during or following an unsuccessful takeover attempt. This evidence 
suggests that this stock price effect reflects more than overpayment by 
the target firms in their acquisitions.

The results from the nontarget and miscellaneous categories are
also of interest. The announcement day AR associated with 21
acquisitions made by nontargets that subsequently are voluntarily divested
is -1.16 percent and is significant at the .01 level. The corresponding
CARs range from -1.66 percent ([-1, 1]) to 2.55 percent ([-5, 40]),
and only the negative CAR associated with the [-1, 1] window is
significant (.05 level). In contrast, the announcement day AR
associated with 211 nontarget acquisitions that are not divested is 0.21
percent, though it is not significant. The corresponding CARs range
from 0.70 percent ([-1, 1]) to 3.58 percent ([-20, 40]), and all these
estimates are significant at the .01 level. On average, these results
indicate that nontarget firms voluntarily divest relatively less
profitable acquisitions. Similar results hold for the miscellaneous
category: the abnormal returns associated with acquisitions subsequently
divested are more negative than those associated with acquisitions that
are not divested.

We noted earlier that while the divestiture rate is significantly 
higher for targets (40.7 percent, 46/113) than for nontargets (9.1 
percent, 21/232), there are only two divestitures by target firms prior 
to a takeover attempt for themselves. Thus the voluntary divestiture 
rate for targets (1.8 percent, 2/113) is actually considerably lower than 
the divestiture rate for nontargets (z-statistic = 3.23). Given the re¬ 
sults reported in tables 3-5, this finding suggests that those non¬ 
targets that divested acquisitions may have avoided takeover attempts 
by divesting less profitable acquisitions, whereas had the target firms 
divested their bad acquisitions, a takeover attempt might not have 
resulted. 

In conjunction with this reasoning, table 6 shows the difference in 
abnormal returns associated with the following paired subsamples of 
acquisitions: the 211 acquisitions made by nontargets that subse¬ 
quently are not divested and (a) the 46 acquisitions made by targets 
that subsequently are divested, (b) the 28 acquisitions made by hostile
targets that subsequently are divested, and (c) the 18 acquisitions 
made by friendly targets that subsequently are divested. These data 
reveal that, with the exception of the abnormal returns estimated over 
the two shortest windows for the last pairing, the differences in the 
abnormal returns across all paired subsamples are highly significant. 
These results indicate that the stock market provides a much differ¬ 
ent evaluation of acquisitions by nontargets that are retained versus 
acquisitions by target firms that are retained until a takeover attempt 
results in a divestiture either by a disciplining acquirer or through 
restructuring efforts to thwart the takeover.

D. Do Value-reducing Acquisitions Increase the 
Likelihood of Becoming a Takeover Target?

To examine the effect that value-reducing acquisitions have on the
probability of becoming a takeover target, we estimate three sets of
logit equations, which differ by the dependent variable and the
sample used for the estimate. The dependent variables and the samples
corresponding to the three sets of equations are (1) the logistic
transformation of the probability that a firm is a target, hostile or
friendly, for the sample of targets and nontargets; (2) the logistic
transformation of the probability that a firm is a hostile target, for
the sample of hostile targets and nontargets; and (3) the logistic
transformation of the probability that a firm is a friendly target, for
the sample of friendly targets and nontargets. Within each set, five
equations are estimated in which one of the independent variables is
the sum of the abnormal returns associated with each firm's
acquisitions. For example, 77 firms made 113 acquisitions in the target
category; thus we have 77 target firm acquisition observations. These
five equations differ only by the five windows over which the abnormal
returns are estimated. We anticipate an inverse relationship between
this independent variable and the likelihood of being a target,
especially a hostile target.
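
The construction of the key regressor and the logistic transformation
can be illustrated with a minimal sketch. The firm names, abnormal
returns, and coefficient values below are hypothetical and are not
estimates from the paper; other regressors are omitted for brevity. The
sketch shows only how several acquisitions collapse into one observation
per firm and how a negative coefficient maps lower summed returns into a
higher implied probability of being a target.

```python
import math
from collections import defaultdict


def sum_returns_by_firm(acquisitions):
    """Collapse (firm, abnormal_return) pairs into one summed return per firm."""
    totals = defaultdict(float)
    for firm, abnormal_return in acquisitions:
        totals[firm] += abnormal_return
    return dict(totals)


def logit(p):
    """Logistic transformation of a probability: log odds."""
    return math.log(p / (1.0 - p))


def inverse_logit(x):
    """Map a linear index back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical acquisitions: Firm A makes two acquisitions, the others one each.
acquisitions = [("Firm A", -2.0), ("Firm A", -1.5), ("Firm B", 0.75), ("Firm C", 1.25)]
firm_returns = sum_returns_by_firm(acquisitions)

# Hypothetical coefficients; the negative slope reflects the anticipated
# inverse relationship, not an estimate reported in the paper.
INTERCEPT, BETA_RETURN = -0.5, -0.25
target_probability = {
    firm: inverse_logit(INTERCEPT + BETA_RETURN * total)
    for firm, total in firm_returns.items()
}

for firm, probability in target_probability.items():
    print(firm, firm_returns[firm], round(probability, 3))
```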

Two other variables, the logarithm of the market value of the firms’
equity (SIZE) and the percentage of equity held by the firms’
managers (MGTHOLD), 18 both computed as of the end of 1981, are
included as regressors in the logit equations. 19 Palepu (1986) finds an
inverse relationship between firm size and the likelihood of becoming
a target during 1971-79. Since our sample period is 1982-88, a
period that witnessed the advent of hostile takeovers for large
corporations, we expect a weaker and perhaps insignificant relationship
between SIZE and the likelihood of becoming a target.

The expected sign of the estimated coefficient on MGTHOLD is
more ambiguous. First, MGTHOLD proxies for the extent to which
managers own sufficient shares to defeat takeover attempts; in this
respect, we expect a negative estimated coefficient on MGTHOLD. In
addition, if equity ownership by management and corporate takeovers
are alternative means of mitigating manager-stockholder conflicts,
we expect firms with low managerial holdings to receive more
takeover bids; this also leads us to expect a negative estimated
coefficient on MGTHOLD. However, MGTHOLD may have a counteracting
positive effect on the likelihood of being a target: where managers
own a large percentage of equity and thus bear a relatively
larger proportion of the wealth consequences associated with their
decisions, they may have stronger incentives to seek out bids, and they
may have less incentive to resist bids and their accompanying
premiums. 20 Hence, the relationship between MGTHOLD and the
likelihood of being a target is ambiguous a priori. The relationship
between MGTHOLD and the likelihood of being a hostile target seems
unambiguous since the argument in favor of a positive coefficient
estimate on MGTHOLD holds only for friendly bids. Hence, we
expect an inverse relationship between MGTHOLD and the likelihood
of being a hostile target. 21

18 The management ownership data come from proxy filings. Data were not available
for Royal Dutch Petroleum, an acquiring firm in the nontarget group.

19 Although the partial correlation coefficient between SIZE and MGTHOLD is
negative (-.329) and significant (p = .0001), we are not concerned with the
simultaneous inclusion of these variables in the logit equations since we are primarily
interested in the estimated coefficient on the abnormal returns. Potentially more
troubling is a statistically significant partial correlation coefficient between each of
the abnormal returns and MGTHOLD (.159 [p = .013], .159 [p = .013], .125 [p = .053],
.130 [p = .043], and .104 [p = .108] for the abnormal returns computed over the [0],
[-1, 1], [-5, 1], [-5, 40], and [-20, 40] windows, respectively). This direct correlation
between the returns to acquiring firms and MGTHOLD is consistent with Lewellen,
Loderer, and Rosenfeld (1985) and You et al. (1986) and suggests that managers are
less likely to make value-reducing acquisitions when they bear a larger proportion of
the wealth consequences of their decisions. Although this significant correlation might
suggest the use of a recursive system, the magnitude of the correlation seems low
enough to allow us to draw meaningful inferences from a single-equation logit model.

The results displayed in table 7 show that the likelihood of being a
takeover target, either friendly or hostile, is significantly and inversely
related to the abnormal stock price performance associated with the
firm’s acquisitions. The estimated coefficient has a negative sign in all
the equations, and all these coefficients are significant at the .05 level
or higher. Both SIZE and MGTHOLD enter with negative estimated
coefficients in all five equations, but none of these estimates is
significant.

The abnormal returns associated with the firms’ acquisitions also
enter with negative, significant coefficients in the set of equations in
which the dependent variable is the transformed probability that the
firm is a hostile target. In four of the five equations, the estimated
coefficient on the abnormal returns is negative and significant at the
.05 level or higher; in the equation containing the announcement day
AR, this estimate is negative and significant at the .10 level. As
anticipated, MGTHOLD enters with a negative and significant (at the .05
level) estimated coefficient in all these equations. However, SIZE does
not enter any of these equations with a significant estimated
coefficient. Weaker results hold when the dependent variable is the
transformed probability that the firm is a friendly target. The estimated
coefficient on the abnormal returns is negative in all five equations,
but it is not significant in the equations containing the announcement
day AR and the CAR computed over the [-1, 1] window. The variable
MGTHOLD enters each equation with a positive but insignificant
coefficient estimate, and SIZE enters each of these equations with a
negative but insignificant coefficient estimate.

20 Although MGTHOLD may be a good approximation of the extent to which
managers directly bear the wealth consequences of their decisions, other measures may be
more appropriate. Specifically, the proportion of a manager's wealth that consists of
equity in his firm might be a better proxy for the extent to which his interests are
directly aligned with stockholders. However, this measure also has its drawbacks.
Theoretically, if a manager's entire wealth consists of equity in his firm, he may be more
risk averse than other, diversified stockholders. In order to diversify his wealth while
maintaining his equity ownership in the firm, he may diversify the firm's activities in ways
that do not necessarily maximize stockholder value. Hence, it is not obvious that as this
proportion increases, managers’ interests are more directly aligned with the interests of
other stockholders. As a practical matter, this variable cannot be measured directly
since the personal wealth of managers is not publicly available information. Since our
principal interest in this paper lies elsewhere, we did not attempt an approximation of
this variable (e.g., the ratio of the value of a manager’s stockholdings to the value of his
salary).

21 Walkling and Long (1984) find that the probability that a takeover attempt is
hostile, as opposed to friendly, is inversely related to the percentage of equity owned by
managers. Morck, Shleifer, and Vishny (1988) find that management of hostile targets
owns significantly less equity than management of friendly targets.

B. Dependent Variable Is Transformed Probability That Firm
Became Hostile Target (vs. Nontarget) (N = 213)

Intercept     .640     .595     .663     1.195    .931
              (.13)    (.11)    (.13)    (.39)    (.25)

In summary, the logit equations show that the results presented in 
tables 3 and 4 remain robust after firm size and management own¬ 
ership are controlled for. Firms that make bad acquisitions are more 
likely to receive a takeover offer than firms that make good acquisi¬ 
tions. 


IV. Conclusion 

The evidence in this paper is consistent with the argument, developed 
originally by Marris (1963) and Manne (1965), that one motive for 
corporate takeovers is to discipline managers who operate their firms 
in ways that do not maximize profits. It is also consistent with a more 
specific argument, developed by Jensen (1986), that many takeovers
discipline managers who use free cash flow to make value-reducing 
acquisitions. 

The evidence is also relevant for arguments made by critics of 
hostile takeovers. First, although critics often lament the advent of 
hostile “bust-up” takeovers (i.e., takeovers that are followed by large 
divestitures of the target firms’ assets), this paper supports the argu¬ 
ment that hostile bust-up takeovers often promote economic effi¬ 
ciency by reallocating the targets’ assets to higher-valued uses. Sec¬ 
ond, these results cast new light on evidence concerning the effect of 
takeovers on the equity value of acquiring firms. Critics of hostile 
takeovers often argue that although target shareholders fare well 
in takeovers, these transactions frequently diminish the equity value 
of acquiring firms. Our evidence suggests that takeovers can be both a
“problem” and a “solution.” Although, in the aggregate, we find that
the returns to acquiring firms are approximately zero, the aggregate
data obscure the fact that the market discriminates between “bad”
bidders, which are more likely to become takeover targets, and
“good” bidders, which are less likely to become targets.

Classical stable population theory, the standard model of population
age structure and growth, is ill suited to addressing many issues that
concern economists and demographers because it is a “one-sex”
theory. This paper investigates the existence, uniqueness, and dynamic
stability of equilibrium in the birth matrix-mating rule (BMMR)
model, a new model of age structure and growth for two-sex,
monogamously mating populations. The paper shows, by means of
examples, that the BMMR model can have multiple nontrivial
equilibria and establishes sufficient conditions for uniqueness. It
generalizes a theorem of W. Brian Arthur to nonlinear systems and
uses it to establish sufficient conditions for local dynamic stability.


The relevance of the economics of the family to mainstream economic 
concerns is now well established. Examples are numerous. In addition 
to old favorites such as labor force participation, investment in human 
capital, and the intergenerational transmission of wealth, they include 
discussions of saving behavior (see Kotlikoff 1988; Modigliani 1988) 
and the burgeoning literature on “Ricardian equivalence” (Barro 
1974; Bernheim 1987; Feldstein 1988). Gary Becker (1988), in his
1987 presidential address to the American Economic Association,
argues the importance of family economics for understanding such
macro issues as cyclical fluctuations, economic development, and
economic growth.

As the family moves toward the center of the economic stage, de¬ 
mography cannot be far behind. Yet “classical stable population the¬ 
ory,” the standard demographic model of population age structure 
and growth, is ill suited to addressing these and many other issues 
that concern demographers and economists. 

The fundamental problem is that classical stable population theory 
is a “one-sex” theory: only females matter. Its two building blocks are 
an age-specific fertility schedule and an age-specific mortality sched¬ 
ule for the female population. Calculating these schedules requires 
very little data: it suffices to observe for a single period the ages at 
which females give birth and the ages at which females die. Classical 
stable population theory imbues these observed, age-specific vital 
rates with significance by assuming that they remain constant over 
time. This assumption allows classical stable population theorists to 
calculate a population’s equilibrium age structure and growth rate 
and to predict its evolution over time. Demography’s two-sex problem 
is to generalize classical stable population theory to monogamously 
mating, age-structured populations. 1 

In the terminology of Thomas Kuhn (1970), demographers have
generally regarded the two-sex problem as a puzzle rather than as a
fundamental anomaly whose resolution might require recasting the
paradigm. Even viewed as a puzzle, demography’s two-sex problem
cannot be solved by introducing constant, age-specific fertility and
mortality schedules for males as well as for females. The male fertility
and mortality schedules play no role in classical stable population
theory, and introducing them in this way yields two incompatible
one-sex models: the “female dominance” model based on the female
mortality and maternity schedules and the “male dominance” model
based on the male mortality and paternity schedules. The
incompatibility of the two mirror-image models becomes evident when the
intrinsic or equilibrium growth rates implied by the female dominance
and the male dominance models differ. Unless these two implied
growth rates happen to coincide, the implied sex ratio of the
population approaches zero or infinity.

1 The two-sex problem has a long history in demography. Alfred Lotka, the founder
of classical stable population theory, discussed it in 1922. Keyfitz (1968), Coale (1972),
Pollard (1973), and Charlesworth (1980) provide surveys of the classical theory.
Goodman (1953), Fredrickson (1971), Keyfitz (1971), Yellin and Samuelson (1977), Das
Gupta (1978), Schoen (1981), Caswell and Weeks (1986), and Pollak (1986, 1987*,
1990) discuss the two-sex problem; Das Gupta, Schoen, and Caswell and Weeks provide
further reference to the literature. In an unpublished doctoral dissertation, Feeney
(1972) presents a two-sex model with the same basic structure as mine, but without a
satisfactory proof of the existence of equilibrium. Biologists recognize two other
two-sex problems. The most fundamental one is why some species, including our own,
reproduce sexually (see Maynard Smith 1978; Bernstein et al. 1985). The second takes
sexual reproduction as given and seeks to explain why the sex ratio for a species or a
population assumes a particular value; R. A. Fisher developed the classical theory of
sex ratio determination; for modern views see Maynard Smith (1980), Charnov (1982),
and Samuelson (1985). Demographers take both sexual reproduction and the sex ratio
of newborns as given.

According to Coale (1972, p. 56), “The greatest imbalance of the
sexes found in a search of recent Demographic Year Books of the
U.N. was in West Berlin in 1950, where $r_m = -0.0001$, and $r_f =
-.0115$.” Pollard (1973) begins his chapter on the two-sex problem by
citing Kuczynski’s calculation of male and female net reproduction
rates for France in the years immediately following World War I.
Using the female model, Kuczynski found that the average number of
daughters that would be born to a female then aged 0 was 0.977;
using the male model, he found that the average number of sons that
would be born to a male then aged 0 was 1.194. Thus the “use of a
one-sex model with the female component of the population would
predict a continually decreasing population for France whilst the
same model applied to the male component would predict a
continually increasing population” (Pollard 1973, p. 82).
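
The divergence of the sex ratio under two incompatible one-sex models
can be seen in a short numerical sketch. It assumes, purely for
illustration, that each sex grows at its own constant per-period rate
and that the initial ratio of males to females is one; the rates are the
West Berlin values quoted from Coale, treated here as per-period rates.

```python
def sex_ratio_path(r_male, r_female, periods, initial_ratio=1.0):
    """Males per female over time when each sex grows at its own constant rate."""
    ratio = initial_ratio
    path = [ratio]
    for _ in range(periods):
        ratio *= (1.0 + r_male) / (1.0 + r_female)
        path.append(ratio)
    return path


# Male dominance rate above female dominance rate: the ratio drifts upward.
path = sex_ratio_path(r_male=-0.0001, r_female=-0.0115, periods=500)

# Swapping the two rates sends the ratio toward zero instead.
reverse_path = sex_ratio_path(r_male=-0.0115, r_female=-0.0001, periods=500)

for period in (0, 100, 500):
    print(period, round(path[period], 3), round(reverse_path[period], 5))
```

Only when the two rates coincide does the ratio stay fixed, which is the
knife-edge case noted in the text.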

Demography’s two-sex problem is a fundamental anomaly that can 
be resolved only by replacing classical stable population theory with a 
model that recognizes that the observed rates for both females and 
males are in disequilibrium. In Pollak (1986) I propose a model of
monogamously mating, age-structured populations, the birth
matrix-mating rule (BMMR) model. In the BMMR model the fertility of a
representative female of a particular age is not a constant but a func¬ 
tion whose value depends on the population’s age-sex composition. 
There are two reasons for this dependence. First, the probability that 
a female of age i will find a mate depends on the number of females in 
each age category and the number of males in each age category. 
Second, the number of offspring produced by a mated female may 
depend not only on her age but also on the age of her mate (see 
Goldman and Montgomery, in press). Thus the BMMR model avoids 
the contradictions of classical stable population theory by allowing 
fertility rates to adjust to the population’s age-sex structure. The 
BMMR model, unlike classical stable population theory, provides a 
theoretical framework capable of analyzing the effects on marriage 
patterns of a “marriage squeeze.” 2 A marriage squeeze can arise when 
a population initially in equilibrium—that is, a population maintain¬ 
ing an unchanging age structure and growing at a constant rate—is 
disturbed by a sudden change in the birth rate. For example, suppose
that females generally marry older males and that an equilibrium is
disturbed by a baby boom. Consider what happens when young
females from the leading edge of the baby boom cohort enter the
marriage market. These females find that, compared with the situation
faced by their older sisters when they entered the marriage market,
there is a surplus of young females relative to appropriately older
males. A mating rule or marriage function (a function mapping the
female and male populations by age into unions identified by the ages
of both partners) is the appropriate construct for analyzing the
adjustments in marriage patterns induced by a marriage squeeze.

2 Schoen (1983) and Goldman, Westoff, and Hammerslough (1984) provide
discussions of the marriage squeeze and references to the literature.

This paper begins by briefly describing classical stable population 
theory. Section II describes the BMMR model and sketches a proof of 
the existence of a nontrivial equilibrium. Section III establishes a 
sufficient condition for local dynamic stability and shows that this 
condition is satisfied in four demographically interesting cases. It then 
establishes a sufficient condition for uniqueness of equilibrium, shows 
that it is satisfied in demographically interesting cases, and demon¬ 
strates by example that the BMMR model can have multiple nontriv¬ 
ial equilibria. Section IV is a brief conclusion. 


I. Classical Stable Population Theory 

The mathematics of classical stable population theory is
straightforward. Let $F$ denote the female population vector by age, $F =
(F_1, \ldots, F_n)$, where $n$ is the greatest age that any individual can
attain. Let $d$ denote the female mortality schedule, $d = (d_1, \ldots,
d_n)$, where $d_n = 1$, and $b$ the female fertility (maternity) schedule,
$b = (b_1, \ldots, b_n)$, where $b_i$ is the number of female offspring born
to a female of age $i$.

These two schedules define a mapping or projection of the female 
population in period t into the female population in period t + 1: the 
age-specific fertility rates determine the number of newborns in pe¬ 
riod t + 1, and the mortality schedule determines the number in each 
of the other age categories. Applying the age-specific fertility sched¬ 
ule to the number of females of each age in period t determines the 
number of newborns in period t + 1: 

F \ +1 = J b,F‘. (1) 

I* 1 

Applying the age-specific mortality schedule to the number of females
of each age in period t determines the number in the successor
category in period t + 1:

$$F_i^{t+1} = (1 - d_{i-1}) F_{i-1}^t, \qquad i = 2, \ldots, n. \tag{2}$$

Matrix notation allows us to express this compactly as

$$F^{t+1} = L F^t, \tag{3}$$

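where the $n \times n$ projection matrix L,

$$L = \begin{bmatrix}
b_1 & b_2 & \cdots & b_{n-1} & b_n \\
(1 - d_1) & 0 & \cdots & 0 & 0 \\
\vdots & & \ddots & & \vdots \\
0 & 0 & \cdots & (1 - d_{n-1}) & 0
\end{bmatrix}, \tag{4}$$

is called the “Leslie” matrix.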
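As a minimal sketch of the projection just described, the following builds a Leslie matrix from a fertility and a mortality schedule and advances a female population by one period. The function names and the three-age schedules are assumptions made only for illustration; they do not come from the paper.

```python
def leslie_matrix(b, d):
    """Build the n x n Leslie matrix from fertility b and mortality d.

    b[i] is the number of female offspring per female of age i + 1;
    d[i] is the death rate at age i + 1 (the last entry, equal to 1, is unused).
    """
    n = len(b)
    L = [[0.0] * n for _ in range(n)]
    L[0] = [float(x) for x in b]          # first row: newborns
    for i in range(1, n):
        L[i][i - 1] = 1.0 - d[i - 1]      # subdiagonal: survivors
    return L


def project(L, F):
    """One period of projection: the matrix-vector product L F."""
    return [sum(l_ij * f_j for l_ij, f_j in zip(row, F)) for row in L]


# Hypothetical three-age schedules, chosen only for illustration.
b = [0.0, 1.5, 1.0]
d = [0.2, 0.5, 1.0]
L = leslie_matrix(b, d)
F_next = project(L, [100.0, 80.0, 40.0])
```

With these illustrative numbers the newborn entry of `F_next` is 1.5 x 80 + 1.0 x 40 = 160, and the other entries are the survivors of the preceding age class.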


Linear algebra provides powerful tools for investigating the exis¬ 
tence of equilibrium and dynamic stability in the classical model. An 
equilibrium is defined as an age distribution, $\hat{F}$, and an equilibrium or
intrinsic growth rate, $\hat{r}$, which satisfy the matrix equation

$$(1 + \hat{r})\hat{F} = L\hat{F}. \tag{5}$$

Thus the equilibrium age distribution is an eigenvector of the Leslie 
matrix and $1 + \hat{r}$ is the corresponding eigenvalue. Standard demographic
terminology calls an age structure that reproduces itself up to
a scale factor a “stable age distribution” rather than an “equilibrium
age distribution.” I have departed from the standard terminology
because it blurs the distinction between the existence problem and the
dynamic stability problem. 3
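Because the equilibrium is an eigenvector of the Leslie matrix, it can be approximated numerically. The sketch below uses power iteration, which is one standard technique and not necessarily how a demographer would compute it in practice; the matrix entries are hypothetical, and convergence assumes that two adjacent age classes are fertile.

```python
def dominant_eigenpair(L, iterations=500):
    """Power iteration: returns (growth factor, age distribution summing to one)."""
    n = len(L)
    F = [1.0 / n] * n
    growth = 1.0
    for _ in range(iterations):
        G = [sum(row[j] * F[j] for j in range(n)) for row in L]
        growth = sum(G)               # F sums to one, so this is the growth factor
        F = [g / growth for g in G]   # renormalize to an age distribution
    return growth, F


# Hypothetical Leslie matrix for three age classes.
L = [[0.0, 1.5, 1.0],
     [0.8, 0.0, 0.0],
     [0.0, 0.5, 0.0]]
factor, F_hat = dominant_eigenpair(L)
r_hat = factor - 1.0                  # intrinsic growth rate
```

The returned pair satisfies the eigenvector equation up to numerical tolerance, so `factor` plays the role of one plus the intrinsic growth rate.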

A mathematical problem that arises in the one-sex model when the 
vector F = 0 requires special attention because it foreshadows more 
serious problems in the two-sex model. In mathematical usage, it is 
conventional to say that an equilibrium is a nonzero vector F that
satisfies (5). From a demographic standpoint it is preferable to call 
F = 0, as well as any other vector F that maps into zero, a “trivial” 
equilibrium. In the one-sex model, trivial equilibria are uninteresting 
both demographically and mathematically. In the two-sex model, triv¬ 
ial equilibria are the major obstacles blocking the “natural” fixed- 
point proof of the existence of a nontrivial equilibrium. In the one- 
sex model, if females beyond a certain age do not reproduce, then 
population vectors consisting entirely of such females will, after a 
finite number of periods, map into the zero vector. In the one-sex 
model, it is often convenient to drop such nonreproductive females 


3 Ecologists may find my terminology misleading because the model contains no
concept of an equilibrium population size.

from the model, analyze the reduced model, and reinterpret the results
in terms of the original model. In the two-sex model we cannot
drop nonreproductive individuals because they may mate with poten¬ 
tially reproductive individuals. 

II. The Birth Matrix-Mating Rule (BMMR) Model

The BMMR model has three essential building blocks: a birth matrix, 
a mating rule, and the female and male mortality schedules. Thus the 
fertility of a female of age i is not specified directly but is derived from 
the underlying birth matrix and the mating rule and depends on the 
population’s age-sex composition. An element of the birth matrix 
such as $b_{ij}$ represents the expected number of female offspring born
in a period to an “(i, j) union,” that is, the union of a female of age i
with a male of age j. More precisely, each (i, j) union formed in period
t produces $b_{ij}$ female offspring and $\sigma b_{ij}$ male offspring who appear as
newborns in period t + 1. The parameter $\sigma$ denotes the secondary
sex ratio (the ratio of male to female newborns), and I assume that it
is a constant, independent of the population's age-sex structure and
independent of the ages of the parents. I denote the birth matrix by
$B = \{b_{ij}\}$, the number of (i, j) unions by $u_{ij}$, and the corresponding
unions matrix by U.

The mating rule shows the number of unions of each type—that is, 
identified by age of female and age of male—as a function of the 
number of individuals in each age-sex category. A mating rule is thus 
a mapping, $\mu(F, M)$, of the population vector (F, M) into the matrix of
unions: $U = \mu(F, M)$; it is often convenient to write $u_{ij} = \mu^{ij}(F, M)$,
where $\mu^{ij}(F, M)$ denotes the function mapping (F, M) into the variable
$u_{ij}$. Any model of a monogamously mating, age-structured population
requires an assumption about the durability of unions. In the BMMR 
model I assume that unions last for a single period. The advantage of 
this “southern California” assumption of serial monogamy is that it 
exposes the model’s logical structure, simplifying substantially both 
the notation and the analysis. The assumption that matings persist for 
one period means that the length of the time period plays a double 
role in the model, as Parlett (1972) points out. Pollak (1987b) analyzes
the BMMR model with “persistent unions.”

The dynamics of the BMMR model are straightforward to describe.

a) The initial population vector, together with the mating rule, 
determines the number of unions of each type; the number of unions 
of each type, together with the birth matrix, determines the number 
of newborns in the next period. The number of newborn females is 
given by the “newborns” function, $\phi^{1}(F^t, M^t)$, and newborn males by
$\sigma\phi^{1}(F^t, M^t)$:

$$\begin{aligned}
F_1^{t+1} &= \phi^{1}(F^t, M^t) = \sum_i \sum_j b_{ij}\,\mu^{ij}(F^t, M^t), \\
M_1^{t+1} &= \sigma\,\phi^{1}(F^t, M^t).
\end{aligned} \tag{6a}$$


Thus the number of female offspring born to an average female of
age i, $B^i(F, M)$, the analogue of the parameter $b_i$ of classical stable
population theory, is a function of the population vector (F, M) and
is equal to a weighted sum (across male age classes) of the birth matrix
parameters $\{b_{i1}, \ldots, b_{in}\}$:

$$B^i(F, M) = \frac{\sum_j b_{ij}\,\mu^{ij}(F, M)}{F_i}. \tag{7}$$

b) The initial population vector, together with the mortality schedules,
determines the number of individuals in each of the other age-sex
categories, just as in classical stable population theory:

$$\begin{aligned}
F_i^{t+1} &= \phi^{iF}(F^t, M^t) = (1 - d^F_{i-1}) F^t_{i-1}, \\
M_i^{t+1} &= \phi^{iM}(F^t, M^t) = (1 - d^M_{i-1}) M^t_{i-1}.
\end{aligned} \tag{6b}$$


Thus the BMMR model defines a transformation that maps the 
population in period t into the population in period t + 1: 


$$(F^{t+1}, M^{t+1}) = \phi(F^t, M^t). \tag{8}$$
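The transformation from one period to the next can be sketched in a few lines. Everything named here is an assumption for illustration: `bmmr_step` and `same_age_rule` are hypothetical names, and the same-age mating rule is only a simple stand-in that respects nonnegativity and adding-up. The paper leaves the mating rule unspecified.

```python
def bmmr_step(F, M, B, mating_rule, d_f, d_m, sigma):
    """Map this period's (F, M) into next period's (F, M)."""
    n = len(F)
    U = mating_rule(F, M)                 # U[i][j]: unions of age-(i+1) females with age-(j+1) males
    newborn_females = sum(B[i][j] * U[i][j] for i in range(n) for j in range(n))
    F_next = [newborn_females] + [(1.0 - d_f[i - 1]) * F[i - 1] for i in range(1, n)]
    M_next = [sigma * newborn_females] + [(1.0 - d_m[i - 1]) * M[i - 1] for i in range(1, n)]
    return F_next, M_next


def same_age_rule(F, M):
    """Hypothetical mating rule: females of a given age mate only with males of that age."""
    n = len(F)
    return [[min(F[i], M[j]) if i == j else 0.0 for j in range(n)] for i in range(n)]


# Illustrative two-age example; all numbers are made up.
B = [[0.0, 0.0],
     [0.0, 2.0]]
F1, M1 = bmmr_step([10.0, 8.0], [10.0, 6.0], B, same_age_rule,
                   d_f=[0.1, 1.0], d_m=[0.2, 1.0], sigma=1.05)
```

Passing the mating rule as a function mirrors the structure of the model: the birth matrix and mortality schedules are fixed parameters, while the mating rule carries all of the nonlinearity.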


Formally, an equilibrium is defined as an age distribution and a
growth rate, $(\hat{F}, \hat{M}, \hat{r})$, that satisfy the equation

$$[(1 + \hat{r})\hat{F}, (1 + \hat{r})\hat{M}] = \phi(\hat{F}, \hat{M}). \tag{9}$$


When the equilibrium female and male fertility rates of the BMMR 
model are used to construct the female dominance model and the 
mirror-image male dominance model, the two one-sex models are 
consistent with each other in the sense that they imply identical 
growth rates for the female and male populations. Furthermore, the 
equilibrium age structure and growth rates corresponding to these 
two one-sex models are identical to the equilibrium age structure and 
growth rates of the BMMR model. Away from equilibrium, however, 
these two one-sex models and the BMMR model generate different 
predictions. 
Establishing the existence of equilibrium and analyzing dynamic
behavior in the BMMR model is more complex than in classical stable 
population theory because the properties of the mating rule make the 
BMMR model inherently nonlinear. Any mating rule for monoga¬ 
mous unions must satisfy two accounting requirements: a nonnegativ¬ 
ity condition ensuring that the number of unions of each type is posi¬ 
tive or zero and an adding-up condition ensuring that the number of 
mated females of each age does not exceed the total number of fe¬ 
males of that age, with a similar requirement holding for males. This 
adding-up requirement is the fundamental source of nonlinearity in 
the BMMR model. 

Despite its nonlinearity, we can establish the existence of equilib¬ 
rium in the BMMR model under suitable assumptions. In addition to 
the two accounting axioms, I impose three substantive axioms on the 
mating rule. (a) Universal scope: The mating rule must be defined for
all nonnegative population vectors. (b) Continuity: The function $\mu$
must be continuous in (F, M). (c) Homogeneity: The function $\mu$ must
be homogeneous of degree one in (F, M): that is, $\mu(\lambda F, \lambda M) = \lambda\mu(F, M)$
for all $\lambda > 0$. The homogeneity axiom implies that a 1 percent
increase in the number of individuals in every age-sex category results 
in a 1 percent increase in the number of unions of every type. Individ¬ 
uals’ searching for mates in a restricted and increasingly crowded 
region suggests a “density dependent” mating rule, but the 
homogeneity axiom precludes dependence of mating on population 
density. Because the elements of the birth matrix and the mortality 
schedules are constants, homogeneity of the mating rule guarantees 
the homogeneity of the mapping $\phi(F, M)$. In classical stable population
theory, the corresponding mapping is linear as well as homogeneous.
Because we are trying to construct a two-sex model capable of
maintaining an unchanging age structure while growing at a constant 
rate, homogeneity is an attractive assumption. These five axioms on 
the mating rule, together with a condition that I call “r-productivity,” 
are sufficient to ensure the existence of equilibrium in the BMMR 
model.

An equilibrium of the BMMR model is a fixed point of the mapping
$\phi(F, M)$ or, more precisely, a fixed point of a related mapping in
which the population vector is suitably normalized. A “natural” proof
strategy would attempt to apply a fixed-point theorem to this mapping.
The difficulty with this strategy is that the mapping carries some
points in the domain into (0, 0). Although (0, 0) and any initial population
vectors $(F^0, M^0)$ that map into (0, 0) satisfy equation (9) and thus
are equilibria, this strategy fails to establish the existence of a nontriv¬ 
ial equilibrium. An alternative proof strategy avoids this difficulty by 
drastically limiting the domain of the mapping, reducing the problem 
to one dimension. A threshold observation is that the only nonlinear
component of the BMMR model is the mating rule. Hence, the only
nonlinear component of the mapping $\phi(F, M)$ is the newborns function
$\phi^{1}(F, M)$. To exploit this fact the existence proof decomposes the
argument into two parts: the determination of an equilibrium age 
structure and the determination of an equilibrium growth rate. We 
begin by disconnecting the birth matrix and the mating rule from the 
mortality schedules and imagining that the number of newborns 
grows at a constant rate r. (For definiteness, suppose that in each 
period storks take away all newborns and bring replacements: in the 
first period they bring N newborn females and $\sigma N$ males; in the
second, $[1 + r]N$ females and $[1 + r]\sigma N$ males; in the third, $[1 + r]^2 N$
females and $[1 + r]^2 \sigma N$ males, and so on.) After n periods the age
structure of the population is uniquely determined by the mortality
schedules, the secondary sex ratio ($\sigma$), the growth rate of newborns
(r), and the initial number of newborn females (N). For example, the
age structure of the female population at time t is given by

$$F_\tau^t = F_1^1 (1 + r)^{t - \tau} \prod_{k=1}^{\tau - 1} (1 - d^F_k), \tag{10}$$

where $\tau \le n$ and $F_1^1 = N$. For n = 3 and no early mortality, this
implies $(F_1, F_2, F_3) = [(1 + r)^2 N, (1 + r)N, N]$. In Pollak (1986) I call a
population with this structure an r-equilibrium. If the model has a 
nontrivial equilibrium $(\hat{F}, \hat{M}, \hat{r})$, then $\hat{F}$ and $\hat{M}$ are r-equilibrium populations
corresponding to $\hat{r}$. A normalized r-equilibrium female population,
F(r), is an r-equilibrium population in which the number of
newborn females is one; the normalized r-equilibrium male population,
M(r), is one in which the number of newborn males is $\sigma$. Since
the newborn females are a cohort of size one, the 1-year-olds are the
survivors of a cohort of 1/(1 + r), the 2-year-olds the survivors of a
cohort of $1/(1 + r)^2$, and so on:

$$F_\tau(r) = (1 + r)^{1 - \tau} \prod_{k=1}^{\tau - 1} (1 - d^F_k). \tag{11}$$

For n = 3 and no early mortality, this implies $F(r) = [1, 1/(1 + r), 1/(1 + r)^2]$.
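The normalized r-equilibrium population described above is easy to tabulate. In this sketch the function name `r_equilibrium` and its argument layout are assumptions for illustration: each older age class is the previous one scaled by its survival rate and divided by 1 + r.

```python
def r_equilibrium(r, d, newborns=1.0):
    """Normalized r-equilibrium age vector.

    d[k] is the death rate at age k + 1 (the last entry is unused);
    `newborns` is 1 for females and the secondary sex ratio for males.
    """
    pop = [newborns]
    for k in range(len(d) - 1):
        # Survivors of a cohort that was smaller by a factor of 1 + r.
        pop.append(pop[-1] * (1.0 - d[k]) / (1.0 + r))
    return pop


# Three ages, no early mortality, r = 1: the vector [1, 1/2, 1/4].
F_r = r_equilibrium(1.0, [0.0, 0.0, 1.0])
```

With no early mortality the result reproduces the three-age example in the text, here evaluated at r = 1.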

If the BMMR model has a nontrivial equilibrium $(\hat{F}, \hat{M}, \hat{r})$, then the
equilibrium age structure $(\hat{F}, \hat{M})$ is an r-equilibrium population for
$r = \hat{r}$. Hence, when we search for a nontrivial equilibrium, it suffices
to restrict our attention to r-equilibrium populations, thus reducing 
the problem to a single dimension: an equilibrium of the BMMR 
model corresponds to a value of r for which the r-equilibrium population
satisfies equation (9):

$$[(1 + r)F(r), (1 + r)M(r)] = \phi[F(r), M(r)]. \tag{12}$$

By the definition of an r-equilibrium population, the number of indi¬ 
viduals in every age-sex category except newborns grows at the rate r. 
Hence, we need only find a value of r for which the birth matrix and 
the mating rule imply that the number of newborn females corresponding
to a normalized r-equilibrium population is 1 + r. In the
example with no early mortality, this implies that the new female
population vector is given by [(1 + r), 1, 1/(1 + r)]. Defining the
function $\psi(r)$ by

$$\psi(r) = \phi^{1}[F(r), M(r)] - 1, \tag{13}$$

we can express this equilibrium condition as

$$\psi(r) = r. \tag{14}$$

In terms of figure 1, establishing the existence of a nontrivial equilibrium
requires showing that the function $\psi(r)$ crosses the 45° line.
Pollak (1986) proves that the BMMR model has a nontrivial equilibrium
provided that the model is r-productive. The r-productivity condition
requires that there exist a value of r for which a normalized
r-equilibrium population produces at least 1 + r newborn females.
The purpose of the r-productivity condition is to ensure that there is
some value of r for which the function $\psi(r)$ lies above the 45° line. The
fixed birth matrix implies that there is some value of r for which the
function $\psi$ lies below the 45° line. For any normalized r-equilibrium
population, a larger value of r implies a smaller number of individuals
in each age group except newborns; furthermore, as r approaches
infinity, the normalized r-equilibrium female population vector
approaches (1, 0, 0, ..., 0) and the corresponding male vector ($\sigma$, 0, 0,
..., 0). For sufficiently large r, only (1, 1) unions can form, and
the adding-up condition implies that the number of such unions cannot
exceed $\min\{1, \sigma\}$. Thus for sufficiently large r, the number of
newborn females must approach or be less than $b_{11} \min\{1, \sigma\}$. For
sufficiently large r, this upper bound on the number of newborn
females must be less than 1 + r. Hence, for sufficiently large r, the
function $\psi(r)$ lies below the 45° line. (In Pollak [1986] I assume that
newborns do not enter unions, so that in the next period the number
of newborns approaches zero and $\psi(r)$ approaches negative one as r
approaches infinity.)

Since the function $\psi(r)$ lies above the 45° line for some values of r
and below it for others, continuity implies that it crosses the 45° line at 
least once. Hence, the BMMR model has a nontrivial equilibrium. 
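The crossing argument is in effect an intermediate-value argument, so a crossing can be located by bisection once a bracket is known. The sketch below is illustrative only: `find_equilibrium_rate` is a hypothetical helper, and the function `psi` comes from a made-up two-age model (same-age mating, no early mortality, only age-2 unions fertile with six female births per union), not from the paper.

```python
def find_equilibrium_rate(psi, r_low, r_high, tol=1e-12):
    """Bisection on psi(r) - r, assuming psi is continuous and the bracket is valid."""
    if psi(r_low) - r_low < 0 or psi(r_high) - r_high > 0:
        raise ValueError("psi must start above and end below the 45-degree line")
    while r_high - r_low > tol:
        mid = 0.5 * (r_low + r_high)
        if psi(mid) - mid >= 0:
            r_low = mid        # still on or above the 45-degree line
        else:
            r_high = mid       # already below it
    return 0.5 * (r_low + r_high)


def psi(r):
    """Hypothetical example: newborn females are 6 / (1 + r), so psi(r) = 6 / (1 + r) - 1."""
    return 6.0 / (1.0 + r) - 1.0


r_star = find_equilibrium_rate(psi, 0.0, 10.0)
```

In this made-up case the crossing solves (1 + r)^2 = 6, which gives an independent check on the numerical result. Bisection finds one crossing inside the bracket; it says nothing about whether others exist.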
Fig. 1.—Existence of equilibrium in the BMMR model


III. Dynamic Stability and Uniqueness 

Dynamic stability, unlike existence, cannot be reduced to one dimension.
In the Appendix I extend the elegant argument of Arthur
(1981, 1982) to show that an equilibrium of the BMMR model is 
locally stable if, in a neighborhood of that equilibrium, the number of 
newborns is a nondecreasing function of the number of individuals in 
every age category and a strictly increasing function of the number of 
individuals in at least two adjacent categories. 4 I show by example 
that, unless additional axioms are imposed, the BMMR model need 
not satisfy this condition. Four sets of demographically meaningful 
conditions imply local stability. 

a) Collective maximizing mating: Suppose that the mating rule is 
such that the configuration of unions formed is the one that maximizes
the total number of newborns. With collective maximizing mating, an 
increase in the number of individuals in any age-sex category cannot 
decrease the number of newborns. Thus maximizing mating implies 

4 Caswell and Weeks (1986) use both analytic and simulation techniques to investigate 
dynamic stability in a related two-sex model. 
that the newborns function is nondecreasing in all its arguments and
strictly increasing in some. To ensure stability in this case (and in the
three remaining cases), we must also assume that the newborns function
is strictly increasing for individuals of two adjacent ages. It might
be thought that evolution would favor collective maximizing mating, 
but this is incorrect. Although collective maximizing mating favors 
the interest of the group, evolutionary arguments run not in terms of 
groups but in terms of individuals or genes (see Dawkins 1976). In the 
vocabulary of economics, natural selection provides no mechanism 
for internalizing externalities. 

b) Individual maximizing mating: Suppose that each individual 
seeks to maximize the number of his or her offspring subject to the 
constraint that all matings must be voluntary and monogamous. Sup¬ 
pose that the birth matrix contains no “ties,” except perhaps for 
unions that produce zero offspring. Then the equilibrium mating 
pattern can be found using a straightforward algorithm. First select 
the largest element in the birth matrix and form the maximal number 
of unions of that type (the maximal number of (i, j) unions is the 
minimum of the number of females of age i and the number of males 
of age j in the population). Now select the largest remaining element 
in the birth matrix and form the maximal number of unions of that 
type from the unmated population (i.e., the population remaining 
after eliminating those individuals mated at previous stages). Proceed 
in this way through the birth matrix or until the only remaining 
elements are zero. Individual maximizing mating need not corre¬ 
spond to collective maximizing mating, but local dynamic stability is 
ensured because an increase in the number of individuals in any age 
category cannot decrease the number of newborns. 
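The algorithm for individual maximizing mating is a greedy matching over the cells of the birth matrix. The following sketch assumes a birth matrix with no ties among positive entries; the function names and the numerical example are hypothetical.

```python
def individual_maximizing_unions(B, F, M):
    """Greedy matching: fill the most fertile union types first."""
    n = len(F)
    females, males = list(F), list(M)     # unmated individuals by age
    U = [[0.0] * n for _ in range(n)]
    cells = sorted(((B[i][j], i, j) for i in range(n) for j in range(n)),
                   reverse=True)
    for fertility, i, j in cells:
        if fertility <= 0:
            break                         # only zero-fertility unions remain
        pairs = min(females[i], males[j]) # maximal number of (i, j) unions
        U[i][j] = pairs
        females[i] -= pairs
        males[j] -= pairs
    return U


def newborn_females(B, U):
    """Total female births implied by the unions matrix."""
    n = len(B)
    return sum(B[i][j] * U[i][j] for i in range(n) for j in range(n))


# Hypothetical two-age example with no ties in the birth matrix.
B = [[0.0, 3.0],
     [2.0, 4.0]]
U = individual_maximizing_unions(B, F=[5.0, 3.0], M=[4.0, 6.0])
births = newborn_females(B, U)
```

Here the three age-2 females first pair with age-2 males, the three remaining age-2 males then pair with age-1 females, and no further fertile unions are possible, giving 4 x 3 + 3 x 3 = 21 female births.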

Proof. Suppose, for definiteness, that a female of age $i_0$ is added to
the population. In the algorithm for individual maximizing mating,
suppose, without loss of generality, that the new female is the last
female of age $i_0$ to be mated. Adding a female of age $i_0$ to the population
has no effect on unions with higher fertility than the one at which
such females become the binding constraint in the algorithm. If the
new female mates with a previously unmated male, then the number
of newborns increases at this stage and no lower-order union is displaced.
If she mates with a previously mated male, then (1) the number
of offspring increases at this stage of the algorithm because the
new union produces more offspring than the union it displaces and
(2) a female of age $i_1$ is displaced. Thus if we introduce a female of age
$i_0$ and withdraw a female of age $i_1$, the net effect is an increase in the
number of newborns. But now we can proceed sequentially, reintroducing
the female of age $i_1$ and withdrawing a female of age $i_2$, and
so on, at each step increasing (or not decreasing) the number of
newborns.

The following example illustrates the difference between collective 
and individual maximizing mating. Suppose that there are two age 
groups and $b_{22} = 4$, $b_{12} = b_{21} = 3$, and $b_{11} = 0$. With equal numbers
in all age-sex categories, collective maximizing mating calls for mixed 
unions, while with individual maximizing mating those in the older 
age category would mate with each other. 

If there are ties between nonzero elements in the birth matrix, 
individual maximizing mating may fail to yield a unique set of equilib¬ 
rium mating patterns or a unique number of newborns. For example, 
suppose that there are two age groups and $b_{22} = b_{21} = 4$, $b_{12} = 3$, and
$b_{11} = 0$. Suppose that $F_2 < M_1 + M_2$. To determine uniquely the
number of newborns (a prerequisite to investigating whether newborns
are a nondecreasing function of the number of individuals in
every age category), we need a tie-breaking rule to determine which
males mate with females of age 2 and which are left. Among the
tie-breaking rules guaranteeing that the newborns function is nondecreasing
are random selection and priority by age (e.g., oldest first,
youngest first).

c) Zero spillover mating: Suppose that the mating rule is such that 
the number of unions involving females of age i and males of age j
depends only on $F_i$ and $M_j$ and is independent of the number of
individuals in the other age-sex categories:

$$u_{ij} = \mu^{ij}(F, M) = \mu^{ij}(F_i, M_j), \quad \text{for all } i, j. \tag{15}$$

Provided that these functions are nondecreasing in $F_i$ and $M_j$, an
increase in the number of individuals in any age-sex category cannot
decrease the number of newborns.

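Schoen’s “harmonic mean” mating rule (Schoen 1981),

$$\mu^{ij}(F, M) = \alpha_{ij}\,\frac{F_i M_j}{F_i + M_j}, \qquad \alpha_{ij} > 0, \quad \sum_j \alpha_{ij} \le 1 \ \forall i \ \text{and} \ \sum_i \alpha_{ij} \le 1 \ \forall j, \tag{16}$$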

is an example of a zero spillover mating rule involving only a single 
parameter for each type of union. 

The constant elasticity of substitution (CES) mating rule,

$$\mu^{ij}(F, M) = \left[(\alpha_{ijf})^{-\rho_{ij}} F_i^{-\rho_{ij}} + (\alpha_{ijm})^{-\rho_{ij}} M_j^{-\rho_{ij}}\right]^{-1/\rho_{ij}}, \tag{17}$$

where $\alpha_{ijf} > 0$, $\alpha_{ijm} > 0$, and $\rho_{ij} > 0$ for all i, j, $\sum_j \alpha_{ijf} \le 1$ for all i, and
$\sum_i \alpha_{ijm} \le 1$ for all j, is a more general zero spillover rule. The summation
conditions and the requirement that $\rho_{ij} > 0$ for all i, j ensure that the
adding-up axiom is satisfied. When $\rho_{ij} = 1$ and $\alpha_{ijf} = \alpha_{ijm} = \alpha_{ij}$ for all
i, j, the CES mating rule reduces to

$$\mu^{ij}(F, M) = (\alpha_{ij}^{-1} F_i^{-1} + \alpha_{ij}^{-1} M_j^{-1})^{-1}, \tag{18}$$

which, after some manipulation, reduces to Schoen’s harmonic mean
mating rule. 5
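The relationship between the two zero spillover rules can be checked numerically. This sketch assumes the CES form in which each share parameter multiplies its population inside the power, and the harmonic mean form with a single parameter per union type; the function names and parameter values are hypothetical.

```python
def ces_unions(F_i, M_j, alpha_f, alpha_m, rho):
    """CES zero spillover rule for a single (i, j) cell, with rho > 0."""
    if F_i <= 0 or M_j <= 0:
        return 0.0                        # no unions without both sexes
    return ((alpha_f * F_i) ** -rho + (alpha_m * M_j) ** -rho) ** (-1.0 / rho)


def harmonic_mean_unions(F_i, M_j, alpha):
    """Harmonic mean rule for a single (i, j) cell."""
    if F_i + M_j <= 0:
        return 0.0
    return alpha * F_i * M_j / (F_i + M_j)


# With rho = 1 and equal share parameters the two rules coincide.
ces_value = ces_unions(3.0, 5.0, 0.4, 0.4, 1.0)
hm_value = harmonic_mean_unions(3.0, 5.0, 0.4)

# Homogeneity of degree one: doubling both populations doubles the unions.
doubled = ces_unions(6.0, 10.0, 0.4, 0.3, 2.0)
single = ces_unions(3.0, 5.0, 0.4, 0.3, 2.0)
```

Under these assumptions the number of unions in a cell never exceeds the share parameter times the population of either sex, which is why share parameters summing to at most one deliver the adding-up condition.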

d) The IMEX model with males in surplus: Pollak (1986) defines
the “identical males-exhaustive mating” (IMEX) model as a special 
case of the BMMR model. We begin by distinguishing between eligible 
and ineligible individuals and defining an exhaustive mating rule. Eligi¬ 
ble individuals are eligible to mate; ineligible individuals are noncan¬ 
didates. Those in the ineligible population are not paired with mates 
even when there are surplus members of the opposite sex and all 
eligible members of their own sex have mates. Thus the eligible popu¬ 
lation may exclude the sick, the very young, and the very old. It may 
also, however, exclude various fractions of the individuals in each 
age-sex category. A mating rule is said to be exhaustive if it never 
leaves both unmated females and unmated males in the eligible popu¬ 
lation. Unless the number of eligible females happens to equal the 
number of eligible males, an exhaustive mating rule does leave un¬ 
mated either some eligible females or some eligible males. An exhaus¬ 
tive mating rule, however, guarantees the formation of the maximum 
number of unions involving members of the eligible population. 

In the IMEX model all males in the eligible population are identical 
in the sense that the fertility of an (i, j) union is independent of the 
age of the male. Provided that males are in surplus in equilibrium, the 
IMEX model is essentially equivalent to the classical model, and in 
the neighborhood of such an equilibrium, an increase in the number
of individuals in any age-sex category cannot decrease the number of 
newborns. The IMEX model with females in surplus, on the other 
hand, provides an example of a specification in which additional indi¬ 
viduals in some age-sex categories can reduce the number of new¬ 
borns. More specifically, additional low-fertility females can reduce 
average fertility per union without increasing the number of unions, 
thus reducing the number of newborns. 

My discussion of dynamics has focused on local rather than global
stability for three reasons. First, some initial population vectors must 
converge to the trivial equilibrium: consider, for example, an initial 
population vector with no females. Second, an example due to Brian 

5 Caswell and Weeks (1986) use the CES mating rule; Jere Behrman, Samuel Preston,
and I are now estimating the CES and other “marriage functions” using U.S. and
Japanese data.

Arthur cited in Pollak (1986) shows that, instead of converging to an
equilibrium, the BMMR model can oscillate; the example violates the 
condition that the number of newborns be strictly increasing for indi¬ 
viduals of at least two adjacent ages. Third, a model with multiple 
equilibria cannot be globally stable, and as the following example 
shows, the BMMR model can have multiple nontrivial equilibria. 

Specifying the example requires mortality schedules, a birth matrix,
and a mating rule. (a) Mortality schedules: All individuals live
two periods. (b) Birth matrix: Only unions in which both the female
and the male are newborns are fertile, so $b_{11}$ is the only nonzero
element of the birth matrix; let $b_{11} = b$. (c) Mating rule: All individuals
are in the eligible population. Females of age 2 have first priority
in mating and prefer males of age 1. Females of age 1 have second 
priority in mating and also prefer males of age 1. 

Applying the mating rule to r-equilibrium populations, {1,
1/(1 + r)}, we find that for $r \le 0$, $F_2 \ge M_1$; hence, for $r \le 0$, all
potentially reproductive males are mated with females of age 2 and
no newborns are produced; for $r \le 0$, $\psi(r) = -1$. For r > 0 we have
$u_{11} = 1 - [1/(1 + r)]$, and hence $\psi(r) = b - 1 - [b/(1 + r)]$. It is easily
verified that this function increases monotonically and is asymptotic
to b - 1. For b < 4, the function $\psi(r)$ does not cross the 45° line (see
fig. 2), and the model has no nontrivial equilibrium. For b = 4, the
function $\psi(r)$ is tangent to the 45° line at r = 1 and the model has a
single nontrivial equilibrium. For b > 4 the function $\psi(r)$ intersects the
45° line twice and the model has two nontrivial equilibria. For example,
for b = 9/2 the equilibria are $\hat{r} = 1/2$ and $\hat{r} = 2$. This example
not only shows that multiple equilibria are possible in the BMMR 
model but also illustrates the crucial role of the r-productivity condi¬ 
tion in ensuring the existence of a nontrivial equilibrium. 
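The three cases in this example can be verified directly. For r > 0, setting b - 1 - b/(1 + r) equal to r and multiplying through by 1 + r gives the quadratic r^2 + (2 - b)r + 1 = 0, whose discriminant is b(b - 4). The function names in this sketch are hypothetical.

```python
import math


def psi(r, b):
    """psi(r) for the two-period example: -1 for r <= 0, else b - 1 - b / (1 + r)."""
    if r <= 0:
        return -1.0
    return b - 1.0 - b / (1.0 + r)


def nontrivial_equilibria(b):
    """Positive roots of r**2 + (2 - b) * r + 1 = 0, the crossings of the 45-degree line."""
    disc = b * (b - 4.0)
    if disc < 0:
        return []                         # psi never reaches the 45-degree line
    root = math.sqrt(disc)
    rates = sorted({(b - 2.0 - root) / 2.0, (b - 2.0 + root) / 2.0})
    return [r for r in rates if r > 0]


none = nontrivial_equilibria(3.0)         # b < 4: no crossing
tangent = nontrivial_equilibria(4.0)      # b = 4: tangency at r = 1
two = nontrivial_equilibria(4.5)          # b = 9/2: crossings at 1/2 and 2
```

The discriminant changes sign at b = 4, which is exactly the threshold separating no equilibrium, a single tangency, and two equilibria.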

A sufficient condition for the BMMR model to have a unique non¬ 
trivial equilibrium is easy to obtain: it is clear from the geometry of 
figure 1 that if the function $\psi(r)$ decreases monotonically, then it can
cross the 45° line only once, and hence the BMMR model can have
only one nontrivial equilibrium. It is plausible that the function $\psi(r)$
could be downward sloping: the larger the value of r used to calculate 
the r-equilibrium, the smaller the number of individuals in each age- 
sex category except newborns; with fewer individuals in each age-sex 
category, one might expect fewer newborns in the next period. 
Nevertheless, the five axioms imposed on the mating rule and the
r-productivity condition do not imply that the function $\psi(r)$ is downward
sloping, and the example just presented shows that the nontrivial
equilibrium need not be unique. An overly strong sufficient
condition for uniqueness is that at every population vector an increase 
Fig. 2.—Cases of no equilibrium and multiple equilibria


in the number of individuals in each age-sex category not decrease 
the number of newborns in the next period. 6 Thus the first three of 
the four demographically meaningful conditions for local dynamic 
stability—the two types of maximizing mating and zero spillover mat¬ 
ing—provide conditions for uniqueness; if any one of these local 
stability conditions holds globally, then the nontrivial equilibrium is 
unique. 7 

IV. Conclusion 

This paper has described the BMMR model, sketched a proof of 
the existence of equilibrium, established sufficient conditions for 
uniqueness and local dynamic stability, and shown by example that 
the model can have multiple nontrivial equilibria. In a model with 
multiple equilibria, initial conditions determine which, if any, of sev¬ 
eral equilibria will be realized, and small differences in initial condi¬ 
tions can cause large differences in long-run behavior. Thus 
uniqueness and dynamic stability are intimately related. Further 
work—both theoretical and empirical—is required to determine 
whether any of the sufficient conditions for uniqueness and dynamic 

6 A weaker but still overly strong sufficient condition is that at every r-equilibrium
population vector, an increase in the number of individuals in each age-sex category
not decrease the number of newborns in the next period.

7 The IMEX model with males in surplus cannot hold globally. 
stability is satisfied and whether multiple equilibria or dynamic instability
is a realistic possibility in the BMMR model.

Simulation is a possible approach to investigating these issues, but 
the BMMR model illustrates the difficulty of using simulation techniques
when a model lacks a parametric specification. Simulating the
BMMR model requires us first to specify a functional form for the 
mating rule and then to specify the appropriate parameter values. An 
analogy with economics is useful. Classical stable population theory is 
like the input-output production model: both are so highly parame¬ 
terized that data from a single period are enough to identify all the 
model’s structural parameters and to allow us to predict its evolution. 
The BMMR model is like the neoclassical production model: both 
involve functions that are not specified parametrically, and the issue 
of specifying “plausible” or “realistic” parameter values for the model
does not arise until functional forms are specified. The power of the 
BMMR model, like that of the neoclassical production model, arises 
from its generality: in both cases, the model’s failure to specify a 
parametric functional form is not a weakness but a strength. 8 

Further work is required to transform the BMMR model from a 
merely formal into a substantive model of population age structure 
and growth. The transformation requires importing behavioral theo¬ 
ries from social science into the BMMR model to explain its three 
primitives: the birth matrix, the mating rule, and the mortality sched¬ 
ules. 9 From the standpoint of social science, however, these three 
primitives are different kinds of analytical constructs. The mortality 
schedule is often regarded as a biological datum, although there is 
ample precedent (from Malthus to recent concern about excess female
infant mortality rates in India) for regarding mortality as endogenous.
The elements of the birth matrix reflect the decisions of individuals


8 Both classical stable population theory and the input-output production model are
linear, but the more significant similarity is that both have simple parametric
specifications. Classical stable population theory is not the only demographic model
that allows us to calculate structural parameters and predict the evolution of a population
from a single period’s data, just as the input-output model is not the only production
model whose entire structure is revealed by a snapshot. If the good fairy who helps
demographers revealed that each mating rule belonged to a particular one-parameter
family (e.g., the harmonic mean or some other suitably restricted subset of the CES
class), that revelation would enable us to calculate the parameters from one period's
data. Similarly, if the good fairy who helps econometricians revealed that the underlying
technology belonged to a particular one-parameter family (e.g., Cobb-Douglas or
some other CES with a known elasticity of substitution), we could calculate the parameters
from a snapshot. In each case, however, different revelations used to analyze the
same data yield different predictions.

9 In the generalized version of the model in which unions can persist for more than
one period (Pollak 1987b), there is a fourth primitive requiring a behavioral explanation:
the schedule specifying the probabilities that unions of each type will end in
desertion or divorce.

or families facing economic and biological constraints; the
analysis of these decisions is the subject matter of the economic theory 
of fertility. The mating rule is more complex analytically than the 
mortality schedules or the birth matrix because it reflects not only 
individual behavior but also the interactions of individuals in the mar¬ 
riage market. Because the mating rule is a ‘‘reduced form” repre¬ 
senting the equilibrium that corresponds to some unspecified set of 
“structural” equations, treating it as a primitive in the BMMR model is 
especially problematic. 

One might think that treating the mating rule as a primitive in the
BMMR model is like treating aggregate excess demand functions as
primitives in general equilibrium analysis, but the analogy is
misleading. Economics possesses a highly developed theory relating
reduced-form excess demand functions to a structural model in which
maximizing economic agents make choices subject to appropriate
constraints. Demography, on the other hand, lacks highly developed
behavioral theories relating reduced-form mating rules to a structural
model of marriage market equilibrium. 10

To summarize: Classical stable population theory is parsimonious
both because it allows us to use well-known, powerful mathematical
techniques to investigate existence and dynamic stability and because
it allows us to infer a population’s equilibrium age structure and its
dynamic behavior from very little data. The BMMR model, which
allows fertility rates to depend on the population’s age-sex structure,
is more complex analytically and more demanding in its data
requirements. In return for these extravagances, the BMMR model solves
demography’s two-sex problem and provides a framework for
addressing the marriage squeeze and other important issues in
demography and economics that require a two-sex model.


Appendix 

Local Stability 

An equilibrium of the BMMR model is locally stable if, in a neighborhood of
that equilibrium, the mapping φ(F, M) is nondecreasing in all its arguments
and strictly increasing for individuals of at least two adjacent ages. This
proposition is a corollary to a global stability theorem that Arthur (1981)
establishes for a one-sex, linear demographic model. The stability argument
sketched here is not self-contained, but only indicates the necessary
connections with Arthur’s. I focus on newborn females; a parallel argument holds
for males.

10 The foundation for a structural theory of the marriage market has only recently
been laid. A substantial literature now exists on matching models, following the line of
analysis begun by Gale and Shapley (1962) in their celebrated paper and Becker (1973,
1981). Mortensen (1988) provides an accessible recent survey and references to the
matching literature. Lam (1988) and Stapleton (1988) analyze marriage in models with
household public goods. Assuming “transferable utility,” Lam examines the differing
effects of gains from marriage attributable to specialization (in household production
or the market) and to joint consumption (of household public goods). Using a “hedonic
price” approach, Stapleton analyzes marriage market equilibrium under the
assumption that individual characteristics vary continuously, in contrast to matching models,
which assume a discrete distribution of individual characteristics.

Continuity implies that, if the initial population vector is sufficiently close to
an equilibrium, then the population vector n periods later will also be close to
that equilibrium. Thus it suffices to make the stability argument in terms of
the population vector $(F^n, M^n)$ instead of $(F^1, M^1)$. This is advantageous
because, regardless of how unbalanced the sexes were in the initial population
vector, after n periods the sex ratio within each cohort must be “balanced,”
that is, consistent with the mortality schedules and the secondary sex ratio.
Hence, the female population vector is a sufficient statistic for the entire
population vector, and we can reduce the domain of the newborns function
from 2n to n dimensions:

$$F_1^t = \Lambda\bigl(F_1^{t-1}, F_2^{t-1}, \ldots, F_n^{t-1}\bigr).$$

Using the mortality schedules, we can express female births as a function not
of the current female population vector but of newborn females in the
previous n periods:

$$F_1^t = \Lambda^*\bigl[F_1^{t-1}, F_1^{t-2}, \ldots, F_1^{t-n}\bigr].$$

The function $\Lambda^*$ is homogeneous of degree one in its arguments. Hence, we
may adopt Arthur’s device of dividing both sides of the function by $(1 + r)^t$:

$$\frac{F_1^t}{(1+r)^t} = \Lambda^*\left[\frac{F_1^{t-1}}{(1+r)^t}\,\frac{(1+r)^{t-1}}{(1+r)^{t-1}},\; \frac{F_1^{t-2}}{(1+r)^t}\,\frac{(1+r)^{t-2}}{(1+r)^{t-2}},\; \ldots,\; \frac{F_1^{t-n}}{(1+r)^t}\,\frac{(1+r)^{t-n}}{(1+r)^{t-n}}\right].$$

Replacing $F_1^t/(1 + r)^t$ by the new variables $\hat{F}^t$, we can write

$$\hat{F}^t = \Lambda^{**}\bigl[\hat{F}^{t-1}, \ldots, \hat{F}^{t-n}, r\bigr].$$

Arthur calls $\{\hat{F}^t, \hat{F}^{t-1}, \ldots\}$ a “growth-corrected” birth sequence. The
function $\Lambda^*$ and, hence, the function $\Lambda^{**}$ are homogeneous of degree one in
$(\hat{F}^{t-1}, \ldots, \hat{F}^{t-n})$, so Euler’s theorem implies

$$\hat{F}^t = \sum_{k=1}^{n} \frac{\partial \Lambda^{**}}{\partial \hat{F}^{t-k}}\, \hat{F}^{t-k}.$$

Treating the coefficients in the Euler’s theorem expression as constants, we
have an expression analogous to Arthur’s linear expression (8), which we can
use to investigate local stability.

To establish stability, it suffices to show that, for some value of r, the
growth-corrected birth sequence becomes constant over time. The essence of
Arthur’s proof is a demonstration that the dynamic process can be viewed as
one of averaging growth-corrected birth sequences and that averaging causes
a contraction of the extreme values in past cohorts. Following Arthur, we pick
r to be a nontrivial equilibrium growth rate, which, in the BMMR model, need not be
unique. For an equilibrium r, a constant growth-corrected birth sequence will
reproduce itself; letting $\hat{F}^{t-1} = \hat{F}^{t-2} = \cdots = \hat{F}^{t-n} = z$, we have

$$z = \sum_{k=1}^{n} \frac{\partial \Lambda^{**}}{\partial \hat{F}^{t-k}}\, z,$$

so the coefficients in the Euler’s theorem expression sum to one. Applied to
this averaging process, Arthur’s argument implies that an equilibrium of the
BMMR model is locally stable provided that, in a neighborhood of the
equilibrium, these coefficients are all nonnegative and that they are strictly positive
for females of at least two adjacent ages. 11 This will be the case if, in
a neighborhood of equilibrium, the function $\varphi(F^t, M^t)$ is nondecreasing in
all its arguments and strictly increasing for individuals of at least two
adjacent ages. 12
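
The averaging argument can be illustrated numerically. The sketch below is not Arthur's proof; it assumes constant, hypothetical Euler coefficients (the list `weights`) that are nonnegative, sum to one, and are strictly positive at two adjacent lags, and it iterates a growth-corrected birth sequence to show the gap between the largest and smallest recent values shrinking. The starting values are arbitrary.

```python
def iterate_births(weights, initial, periods):
    """Iterate F_hat[t] = sum_k weights[k-1] * F_hat[t-k].

    weights : hypothetical constant Euler coefficients for lags 1..n
    initial : n starting values of the growth-corrected sequence, oldest first
    Returns the whole sequence (starting values followed by new ones).
    """
    n = len(weights)
    if len(initial) != n:
        raise ValueError("need one initial value per lag")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must be nonnegative and sum to one")
    seq = list(initial)
    for _ in range(periods):
        recent = seq[-n:]  # recent[-k] is the value k periods back
        seq.append(sum(w * recent[-k] for k, w in enumerate(weights, start=1)))
    return seq


def spread(values):
    """Gap between the extreme values of a window of the sequence."""
    return max(values) - min(values)


# Assumed example: positive weight on two adjacent lags (2 and 3) only.
weights = [0.0, 0.5, 0.5]
initial = [1.0, 3.0, 2.0]
seq = iterate_births(weights, initial, periods=200)

start_spread = spread(seq[:3])
final_spread = spread(seq[-3:])
print(start_spread, final_spread)
```

Because each new value is a weighted average of earlier ones, the sequence stays inside the range of its starting values while the spread contracts toward zero, which is the sense in which the growth-corrected sequence "becomes constant over time."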
Alternative Common-Value Auction
Procedures: Revenue Comparisons 
with Free Entry 


Ronald M. Harstad 

Virginia Commonwealth University 


The logic of revenue comparisons for different types of common- 
value auctions is substantially altered if the number of participants, 
rather than being fixed, responds endogenously to the expected 
profitability from participating. In a thoroughly symmetric model, a 
seller may prefer that competition be indirect: an auction procedure 
in which fewer participants are needed to drive the expected 
profitability from participating down to the level obtainable in other 
auctions in the economy can attain higher expected revenue if a sale 
is sufficiently likely. This insight allows a complete revenue ranking 
of standard auction procedures, with endogenous entry. 


I. Introduction 

Questions of revenue comparison across auction procedures dominate
the theoretical and empirical literature on auctions (see McAfee
and McMillan 1987a). The most studied theoretical model, the
independent private-values model, exhibits revenue equivalence across a
wide variety of auction procedures (Vickrey 1961; Milgrom and
Weber 1982, pp. 1092-93). That model presumes that each bidder
knows with certainty the auctioned asset’s value to him, an unlikely
description for auctions of durable or resalable goods or of assets of
uncertain quality. For these, the common-value model may be more
useful: it assumes that the asset has a common but unknown value,
and bidders have private, imperfect information about this value. For
common-value auctions (generalized), Milgrom and Weber (1982)
provide expected revenue rankings across several standard auction
procedures, as well as impacts on revenue of such seller choices as
publicly announcing information correlated with the auctioned asset’s
value.

I thank, without implicating, Doug Davis, Don Hausch, Charlie Holt, Dan Levin,
Sherwin Rosen, Michael Rothkopf, and participants in the Microeconomics Workshop
at the University of Virginia for helpful comments.

Revenue comparisons cited are based on symmetric Nash equilibrium
outcomes for an exogenously given number of bidders. These
analyses thus proceed as if no other auction (or other transaction)
may offer a substitutable economic opportunity. An alternative
approach, taken here, assumes that the number of competing bidders
corresponds endogenously to expected profitability from competing.
In symmetric equilibria of multistage games in which the last stage
will be a common-value auction, with the number of bidders
determined in a prior participation stage, inferences can be drawn that
depend on the auction procedure only through the equilibrium
number of participants. When fruition (i.e., a sale occurs) is sufficiently
likely, an extension of the common-value revenue comparisons in
Milgrom and Weber is obtained. With endogenous entry, however, a
perhaps troublesome conclusion emerges: a seller often prefers an
auction procedure because it generates fewer participants. Frequently
voiced desires by sellers to generate larger numbers of bidders may
need to be reevaluated. 1 (All results discussed here have a
corresponding interpretation in procurement auctions.)

A few recent papers have adopted aspects of an implicit bidder 
participation decision, with an endogenous number of participants. 
In private-values auctions with entry, a seller’s best reserve price is his 
own asset value (see Engelbrecht-Wiggans [1987] and Samuelson [in 
press] for examples). Milgrom (1981), for second-price auctions, and 
Matthews (1984) and Hausch (1988), for first-price auctions, consider 
special examples of common-value auctions. Extreme examples in 
Engelbrecht-Wiggans and Weber (1979) and Engelbrecht-Wiggans 
(1988) resist classification. Models with an uncertain (but exogenously 
determined) number of bidders appear in McAfee and McMillan 
(1987b), Matthews (1987), and Harstad, Kagel, and Levin (in press).


1 This model assumes noncooperative bidders no matter what their number. A seller 
who believed that collusion could be avoided only by attracting many bidders may face 
a trade-off between lower chances of collusion and higher expected revenue given 
noncooperative behavior. A belief that collusion could be avoided either by many 
bidders or by a sealed-bid auction may help in part to explain the behavior of the U.S. 
Forest Service, which auctions timber harvesting rights, employing both English
auctions and first-price auctions, with an apparent selection bias that may relate to the
anticipated number of bidders (Hansen 1986).
II. Bidder Participation

Imagine a model in which potential bidders choose among a variety of
auctions and other uncertain economic opportunities in which to
invest their time and money. We shall examine a segment of the
extensive form of such a model, relating to a particular auction. An
indivisible asset with stochastic value $V$ (to any bidder) is sold. A subset of
potential bidders will participate, and a subset of participants will
become actual bidders (the latter distinction is inessential unless an
entry fee is imposed). Each participant $i$ privately observes a signal $X_i$
about $V$. The joint distribution of the affiliated variables $(V, \{X_i\})$ is
common knowledge and is symmetric in the $\{X_i\}$. 2 For example, $V$
could be uniform on $[L, H]$, with the $X_i$’s independent and uniform
on $[V - Q, V + Q]$, where $Q$ is known. We shall refer to this
example repeatedly, but simplifying it by considering only cases in which
$L + Q \le X_i \le H - Q$.

The game segment unfolds as follows. First, the seller announces
an auction procedure $F = (f, e_f, r_f) \in \mathcal{F} \times \mathbb{R}^2_+$, where $f$ is an auction
type (e.g., a first-price auction with none of the seller’s private
information revealed), $\mathcal{F}$ the set of auction types, $e_f$ an entry fee, and $r_f$ a
reserve price. Second, each of a pool of potential bidders selects a
probability of becoming a participant in this auction, based on $F$.
Participation has two consequences: observing a signal, as mentioned
above, and incurring a participation cost $c > 0$. This cost is likely to
vary across auctions, but $c$ is the same for all potential bidders in a
given auction and is invariant to the procedure by which the auction is
run. Unlike the entry fee, $c$ does not accrue to the seller; it should be
viewed as a forgone profitable opportunity (time consumed or
inability to participate in another auction occurring elsewhere). 3 Third,
each participant selects a probability of becoming an actual bidder,
based on the signal observed and $F$. Each actual bidder pays the entry
fee to the seller. Fourth, actual bidders select bidding strategies for
the auction type $f$. The high bidder obtains the asset and pays the
seller a price determined by the auction type, the reserve price, and
the bids submitted. Potential bidders are assumed to approach any
auction with the objective of maximizing expected profit; that is, all
are risk neutral. 4

2 Any pair of affiliated variables satisfies the monotone likelihood ratio property, so a
higher value for one makes a higher value for the other more likely. Independence
conditional on $V$ is a special case of affiliation (see Milgrom and Weber 1982, p. 1098
ff.).

3 This consideration is missed if a view of substitute auctions is not at least implicitly
present. Unwillingness of an additional potential bidder to participate need not imply
zero net expected profit since participation costs may include expected profit
opportunities forgone to participate in this auction (see Engelbrecht-Wiggans and Weber 1979).
For example, suppose that a “wildcat” petroleum exploration firm has a limited
quantity of trusted executives for pre-auction information gathering and processing and
bidding strategy determination. Such a firm may view both a U.S. offshore sale and a
North Sea offshore sale occurring in the same month as profitable opportunities;
maximizing expected profit may imply focusing on one and forgoing the other.

The solution concept applied is completely symmetric: each potential
bidder selects the same probability of participating; in the event of
participation, each selects the same function of the signal observed to
determine the probability of actually bidding; each selects the same
bidding strategy in the event he actually bids; and these selections
constitute a Bayesian Nash equilibrium.

Under these conditions, a seller does not choose among auction
procedures with the number of bidders invariant to his choice.
Rather, the expected number of participants will vary with the
auction procedure $F$ so that expected profit equals participation cost. To
formalize, let $N$ be the number of potential bidders, $n(F)$ the
equilibrium expected number of participants, and $a[F, n(F)]$ the equilibrium
expected number of actual bidders. (Below, the dependence of $a$ on $n$
will be suppressed.) Then $n(F)/N$ is the equilibrium probability of
participation. To simplify, assume $N > n(F)$, which surely holds for
large enough $N$.

The asset being auctioned is worth $V$; the expected price paid by
the winning bidder, in symmetric equilibrium given auction
procedure $F$, $a$ actual bidders, and $n \ge a$ participants, conditional on $V = v$,
is represented as $p(F, a, n, v)$. As a convenient normalization, all
prices ($V$, the $X_i$, $e_f$, $r_f$, and profit and revenue calculations below)
are measured in units of participation cost, by setting $c = 1$. For our
example, let $F = 2$ represent a second-price auction (without public
information), with $e_f = r_f = 0$. When the uniform distributions
mentioned are used, the symmetric equilibrium bid function is $b(x, n) =
x - [Q(n - 2)/n]$. The high bidder pays a price equal to the
second-highest bid; the expected value of the second-highest signal, given
$V = v$, is $v + [Q(n - 3)/(n + 1)]$. So, for the example, $p(2, a, n, v) =
v - [2Q(n - 1)/n(n + 1)]$.
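
The price formula for the example can be checked by simulation. The following is a rough Monte Carlo sketch rather than part of the paper's argument: it applies the stated bid function $b(x, n) = x - Q(n - 2)/n$ to every signal (an assumption, since the text restricts attention to interior signals), and the values of `n`, `Q`, `L`, `H`, `draws`, and `seed` are illustrative choices. It compares the average of $v$ minus the price paid with $2Q(n - 1)/n(n + 1)$.

```python
import random


def simulate_gross_profit(n, Q, L=20.0, H=80.0, draws=100_000, seed=0):
    """Average of (value - price paid) in simulated second-price auctions.

    Every bidder is assumed to bid signal - Q*(n-2)/n, so the price is the
    second-highest signal minus the same shading term.
    """
    rng = random.Random(seed)
    shade = Q * (n - 2) / n
    total = 0.0
    for _ in range(draws):
        v = rng.uniform(L, H)                       # common value
        signals = sorted(rng.uniform(v - Q, v + Q) for _ in range(n))
        price = signals[-2] - shade                 # second-highest bid
        total += v - price
    return total / draws


def predicted_gross_profit(n, Q):
    """Closed form implied by the example: 2Q(n-1) / (n(n+1))."""
    return 2 * Q * (n - 1) / (n * (n + 1))


n, Q = 4, 10.0
simulated = simulate_gross_profit(n, Q)
predicted = predicted_gross_profit(n, Q)
print(simulated, predicted)
```

With these illustrative inputs the closed form gives 3 units of participation cost, and the simulated average should differ from it only by sampling noise.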

4 Opportunities to obtain similar assets in other auctions allow a potential bidder to
diversify his portfolio of risky returns from participation in various auctions and other
uncertain economic opportunities. Revenue comparisons for a particular seller are
greatly simplified if a potential bidder’s participation and bidding strategies are
separable across auctions; separability results only if bidders' underlying (i.e., across-auction)
utility functions exhibit risk neutrality or constant absolute risk aversion.

The expected profit in the event of winning, gross profit in that $c$
and $e_f$ are not taken into account (but $r_f$ is incorporated), is the
expected difference between $V$ and the conditional expected price:

$$w(F, a, n) = E\{V - E[p(F, a, n, \cdot) \mid V]\}. \tag{1}$$

For the example, $w(2, a, n)$ is simply $2Q(n - 1)/n(n + 1)$. The
equilibrium condition that the expected cost of participation equals the
expected net profit is

$$1 + \frac{a(F)}{n(F)}\, e_f = \frac{w[F, a(F), n(F)]}{a(F)} \cdot \frac{a(F)}{n(F)}, \qquad \forall F \in \mathcal{F} \times \mathbb{R}^2_+. \tag{2}$$

For the example, (2) becomes $1 = 2Q(n - 1)/n^2(n + 1)$.
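
The example's entry condition can be solved numerically for the equilibrium number of participants. The sketch below uses plain bisection on $2Q(n - 1)/n^2(n + 1) = 1$, searching over $n \ge 2$ where the left side is decreasing in $n$; the function names are illustrative, and $Q = 11.78$ is the value the paper uses later for the second-price auction.

```python
def expected_gross_profit(n, Q):
    """w(2, a, n) for the uniform example: 2Q(n-1) / (n(n+1))."""
    return 2.0 * Q * (n - 1.0) / (n * (n + 1.0))


def equilibrium_participants(Q, tol=1e-10):
    """Solve 1 = 2Q(n-1) / (n^2 (n+1)) for n >= 2 by bisection.

    With zero entry fee, a participant wins with probability 1/n, so the
    condition is w/n = 1 in units of participation cost.
    """
    def excess(n):
        return expected_gross_profit(n, Q) / n - 1.0

    lo, hi = 2.0, 4.0
    if excess(lo) <= 0:
        raise ValueError("Q too small for an equilibrium with n >= 2")
    while excess(hi) > 0:        # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


Q = 11.78
n_star = equilibrium_participants(Q)
print(round(n_star, 2))
```

At the solution the winner's expected gross profit equals the expected number of participants, since each of them has spent one unit of participation cost.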

Notice which variables in equation (2) are determinable and what
the source for the calculation is. The functional form of $a(\cdot)$ could
be calculated as follows: for a given $F$ and an arbitrary $n$, there would
be a subset $\mathcal{X}$ of the support of $X_i$ consisting of signals for which the
expected profitability of winning justified paying the entry fee and
bidding at least the reserve price. The probability that $X_i \in \mathcal{X}$ serves to
determine the functional form of $a(F)$; the probability that a
participant pays the entry fee is then symmetric: $a(F)/n(F)$. At the second
stage, when potential bidders decide whether to participate, signals
have not yet been observed, so the chances of winning are symmetric:
$1/n(F)$ since the $a(F)$ terms in (2) cancel. The functional form of $w(\cdot)$
for given $F$ and $n$ is a common knowledge calculation, so (2) defines
the functional form of $n(\cdot)$. Assume, as is natural, that $w[F, a(n), n]$ is
decreasing in $n$ for all $F \in \mathcal{F} \times \mathbb{R}^2_+$.

A remark on perspective: Only familiar assumptions in the auction
literature are used here. (That changing the auction procedure
cannot per se alter the asset’s value is assumed by Milgrom and Weber
[1982]. That changing the number of bidders does not alter the asset’s
common value is assumed by Milgrom [1981, sec. 5]. Both
assumptions are made by French and McCormick [1984], Matthews [1984],
and Hausch [1988]. One may expect a higher number of bidders to be
drawn to the auction of a more valuable asset, but this is a ceteris
paribus heuristic.) Moreover, some standard assumptions are more
appropriate to the questions addressed here than to antecedent uses.
Revenue comparisons of auction procedures most naturally arise after
the seller has adopted any changes that enhance the asset’s intrinsic
value. Symmetric behavior seems the only sensible way to predict
profitability of participation before the identity of participants is
known. Finally, rationality is natural: it would be odd at best to
investigate auction participation decisions that were based on the expected
profitability attainable from the winner’s curse or other irrational
bidding behaviors (likely a better prescription would be, simply, not to
participate).


III. Inferences for Revenue Comparisons

Expected revenue for auction procedure $F$, in symmetric equilibrium,
is the sum of the expected price paid by the winning bidder and entry
fees paid by all actual bidders:

$$\begin{aligned}
R[F, a(F), n(F)] &= E\{p[F, a(F), n(F), \cdot]\} + a(F) e_f \\
&= E(E\{p[F, a(F), n(F), \cdot] \mid V\}) + a(F) e_f \\
&= E(E\{p[F, a(F), n(F), \cdot] \mid V\} - V + V) + a(F) e_f \\
&= E[V - n(F) - a(F) e_f] + a(F) e_f \\
&= E(V) - n(F),
\end{aligned} \tag{3}$$

with substitution from (1) and (2). Thus the seller expects to receive
asset value less aggregate participation costs. 5 The seller is of course
also concerned with the likelihood of fruition, the event that at least
one participant pays any entry fee and submits a bid no less than the
reserve price. For $N$ potential bidders, given participation probability
$\pi$ and probability $\alpha$ that a participant actually bids, the fruition
probability is

$$\mathcal{P}(\alpha, \pi, N) = 1 - \sum_{i=0}^{N} (1 - \alpha)^i \binom{N}{i} \pi^i (1 - \pi)^{N-i}. \tag{4}$$

It is also convenient to let $\mathcal{S}(\alpha, \pi, N) = \alpha[1 - \mathcal{P}(\alpha, \pi, N - 1)]$, which
is the chance that an assumed participant submits the only bid.
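
As a quick check on the fruition probability, the sketch below evaluates the binomial sum in (4) directly and compares it with the closed form $1 - (1 - \alpha\pi)^N$ that the binomial theorem gives for the same sum. The function names and the numerical inputs are illustrative assumptions, and `sole_bid_probability` is a stand-in name for the second quantity defined in the text.

```python
from math import comb


def fruition_probability(alpha, pi, N):
    """Equation (4): one minus the chance that nobody actually bids."""
    no_bid = sum(
        (1 - alpha) ** i * comb(N, i) * pi ** i * (1 - pi) ** (N - i)
        for i in range(N + 1)
    )
    return 1.0 - no_bid


def fruition_closed_form(alpha, pi, N):
    """Same probability via the binomial theorem: 1 - (1 - alpha*pi)^N."""
    return 1.0 - (1.0 - alpha * pi) ** N


def sole_bid_probability(alpha, pi, N):
    """alpha * [1 - P(alpha, pi, N - 1)]: an assumed participant bids alone."""
    return alpha * (1.0 - fruition_probability(alpha, pi, N - 1))


alpha, pi, N = 0.8, 0.2, 20        # illustrative values only
p_sum = fruition_probability(alpha, pi, N)
p_closed = fruition_closed_form(alpha, pi, N)
s_val = sole_bid_probability(alpha, pi, N)
print(p_sum, p_closed, s_val)
```

The closed form makes the comparative statics easy to read: fruition becomes more likely as either the participation probability or the bidding probability rises.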

Observation 1. Any two auction procedures $F$ and $F'$ with the
same equilibrium number of participants attain the same expected
revenue (a) assuming fruition or (b) given zero reserve prices and
entry fees.

Proof. Part a follows from (3). For part b, each participant bids, so
the event of zero actual bidders has probability $\{1 - [n(f, 0, 0)/N]\}^N$
for both $(f, 0, 0)$ and $(f', 0, 0)$. Q.E.D.

Thus revenue comparisons of two auction procedures with equal
numbers of bidders depend crucially on whether equal numbers is an
assumption or a characterization. Equal numbers is solely an
assumption for common-value auction revenue rankings in Milgrom and
Weber (1982).

Observation 2. If auction procedure $F'$ yields a participant higher
expected profit than auction procedure $F$, when both are evaluated at
the equilibrium number of participants for $F$, then $F'$ will have a
higher equilibrium number of participants than $F$.

Proof. The proposition asserts $w[F', \cdot, n(F)] > w[F, \cdot, n(F)] \Rightarrow n(F') >
n(F)$, which is a trivial consequence of (2). Q.E.D.

5 For the special cases they study, a corresponding result is discussed by French and
McCormick (1984), is found by Hausch (1988), can be calculated in Milgrom (1981),
and is found as an asymptotic approximation in Matthews (1984) (where the number of
bidders is not necessarily an equilibrium level, but the participation costs are).

Observation 2 requires no particular relation between $(e_f, r_f)$ and
$(e_{f'}, r_{f'})$. Notice that any impact of entry fees on reducing the number
of actual bidders may not be directly relevant; it is the number of
participants the seller may wish to reduce.

Observation 3. Equilibrium expected revenue is inversely related to
the equilibrium number of participants (a) given at least one actual 
bidder or (b) whenever fruition is sufficiently likely. 6 

Proof. Follows from (3). Q.E.D. 

Observation 4. Let auction form $F$ attract fewer participants than
$F'$, in equilibrium. A fruition probability sufficient for observation 3
to apply is

$$\frac{\mathcal{P}(\alpha, n(F)/N, N)}{\mathcal{S}(\alpha, n(F)/N, N)} \ge E(V) - n(F). \tag{5}$$

Proof. Naturally, the ratio $a(F)/n(F)$ is nonincreasing in $n(F)$ (see
Milgrom and Weber 1982, theorem 19), so it suffices to treat the case
in which this ratio is a constant $\alpha \in (0, 1]$. For this case, the function
$H(n) = [E(V) - n]\mathcal{P}(\alpha, n/N, N)$ extends equilibrium expected
revenue from its domain of definition (taken to be $\{n \in \mathbb{R} \mid \exists F : n = n(F)\}$) to
$\mathbb{R}$. For $n^*$ defined by equality in (5), $dH/dn < 0$ on $(n^*, N)$. Q.E.D.

Sufficiently likely fruition can often be obtained with few
participants, particularly if $c$ is large relative to $E(V)$, or if $N$ is relatively
small. For example, if $N = 20$, then for the example above with
$[L, H] = [20, 80]$ and $Q = 17.15$, $(e, 0, 0)$ ($e$ an English auction) is an
expected-revenue-maximizing auction, drawing 3.67 participants on
average. However, with $Q = 11.78$, $(2, 0, 0)$, the second-price auction,
draws 3.67 participants and higher expected revenue than the
English auction. For $N = 6$, $[L, H] = [12, 48]$, and $Q = 11.38$, $(e, 0, 0)$
draws an expected 2.91 participants, and the seller does not want
more.
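
The numbers in these examples can be explored with the function $H(n) = [E(V) - n]\mathcal{P}(\alpha, n/N, N)$ from the proof of observation 4. The sketch below is an illustration under stated assumptions: it takes $\alpha = 1$ (zero entry fee and reserve price, so every participant bids), uses the closed form $1 - (1 - \alpha n/N)^N$ for the fruition probability, measures $E(V)$ in units of participation cost, and locates by bisection the $n$ at which the hand-derived derivative of $H$ changes sign. The function names are hypothetical.

```python
def fruition(alpha, n, N):
    """P(alpha, n/N, N) in closed form: 1 - (1 - alpha*n/N)^N."""
    return 1.0 - (1.0 - alpha * n / N) ** N


def extended_revenue(n, EV, N, alpha=1.0):
    """H(n) = [E(V) - n] * P(alpha, n/N, N)."""
    return (EV - n) * fruition(alpha, n, N)


def revenue_peak(EV, N, alpha=1.0, tol=1e-10):
    """Find the n in (0, N) beyond which H(n) declines."""
    def slope(n):
        # dH/dn = (EV - n) * alpha * (1 - alpha*n/N)^(N-1) - P(alpha, n/N, N)
        miss = 1.0 - alpha * n / N
        return (EV - n) * alpha * miss ** (N - 1) - (1.0 - miss ** N)

    lo, hi = 0.0, float(N)
    if slope(hi) >= 0:           # H still rising at n = N
        return hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if slope(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# E(V) = 50 for V uniform on [20, 80]; E(V) = 30 for V uniform on [12, 48].
peak_20 = revenue_peak(EV=50.0, N=20)
peak_6 = revenue_peak(EV=30.0, N=6)
print(round(peak_20, 2), round(peak_6, 2))
```

Under these assumptions the two peaks come out close to the 3.67 and 2.91 participants quoted in the text, which is consistent with the statement that the seller "does not want more."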

If a seller can switch to an auction procedure that will attract fewer
participants, then each participant will have a higher chance of
winning and so will in equilibrium settle for a lower expected profit in the
event of winning. The winner’s lower expected profit (as long as there
is a winner) means that the seller receives an expected revenue nearer
the asset’s expected value. 7

6 Observations 1 and 2 can readily be extended to the “general symmetric model” of
Milgrom and Weber (1982). The forces at work in observation 3 are also more general:
in equilibrium with endogenous entry, given fruition, revenue will be nearer $V$ the
smaller the equilibrium number of participants. However, outside the common-value
model, an additional consideration runs counter to this force: with more participants, $V$
may be higher (in proportion to a first-order statistic).

A natural comparative static question is whether a seller with a
costless opportunity to reduce $c$ gains by doing so. For given $F$,
equilibrium $n(F)$ must rise as $c$ falls. By (3), expected revenue will rise if
$n(\cdot)$ responds to $c$ inelastically. Such an inelastic response follows from
(2) and $w(\cdot)$ declining with $n$ when $e_f = 0$; when $e_f > 0$, the additional
assumption that $a(\cdot, n)$ is increasing in $n$ is sufficient.

The following revenue comparisons for common-value auctions
with endogenous bidder participation can be inferred. Each assumes
the condition in observation 4 and follows from the first three
observations above and the theorems from Milgrom and Weber (1982)
indicated in parentheses. (1) Expected revenue for an English auction
is not less than that for a second-price auction (theorem 11). (2)
Expected revenue for a second-price auction is not less than that for a
first-price auction (theorem 15). (3) Publicly announcing any
information the seller has that is affiliated with asset value cannot lower, and
may raise, expected revenue for each of the three auction types
discussed above (theorems 8, 9, 12, 13, 16, 17, and 18). (This does not
say whether the seller should reveal or conceal the number of actual
bidders.) (4) For each of the three auction types, higher entry fees and
corresponding lower reserve prices generally raise expected revenue
(theorem 19).

Thus the seller’s preference for an English auction, with public
information announced, generalizes. To the extent that entry fees can
reduce the number of participants without measurably reducing the
likelihood of fruition, this will enhance expected revenue. With
endogenous bidder participation, however, the reason for these
preferences is different: these policies reduce the equilibrium level of
competition.

7 Econometric studies relate higher revenue to a large number of bidders (McAfee
and McMillan 1987a; Brannman, Klein, and Weiss 1987). A stated objective in the
Outer Continental Shelf Lands Act amendments of 1978 (92 Stat. 629) is to increase the
number of bidders in offshore mineral rights auctions. The U.S. Geological Survey
conducts these auctions, many tracts at a time. When bids are in, the computer
algorithm used to determine whether to award a “wildcat” tract does so automatically if
the tract drew at least three bids. Some possible causes for this disparity are that (a) the
econometric studies may have inexact proxies for asset value, (b) participants may not
be employing risk-neutral symmetric equilibrium strategies (see Kagel and Levin 1986;
Kagel, Levin, and Harstad 1988), (c) participation decisions may not be equilibrium
choices (the equilibrium participation decisions posited here would not be best
responses if rivals’ bidding were not in equilibrium), and (d) this model may omit some
key element of the markets studied.

8 The precise statement is cumbersome. Suppose that $(f, e_f, r_f)$ is regular ($z \in \mathcal{X}$, $z' > z
\Rightarrow z' \in \mathcal{X}$) when evaluated at $n' = n(f, e'_f, r'_f)$, where $e'_f < e_f$, and $r'_f$ has been set so that
$a[(f, e_f, r_f), n] = a[(f, e'_f, r'_f), n]$ at both $n = n'$ and $n = n(f, e_f, r_f)$. (Milgrom and Weber
refer to this as having the same screening level.) Then $(f, e'_f, r'_f)$ is regular at $n'$ but
attains expected revenue that is at most that of $(f, e_f, r_f)$.
Confirmations and Contradictions


The "Oral Tradition” at Chicago in the 1930s 

Frank G. Steindl 

Oklahoma State University


In “The Quantity Theory of Money: A Restatement” (1956),
Friedman states that “Chicago was one of the few academic centers at which
the quantity theory continued to be a central and vigorous part of the
oral tradition throughout the 1930’s and 1940’s, where students
continued to study monetary theory and to write theses on monetary
problems” (p. 3; italics added).

In a celebrated article in the inaugural issue of the Journal of Money,
Credit and Banking, Patinkin (1969) disputes the existence of the oral
tradition, marshaling for his indictment a host of evidence: class
notes, theses, and writings of Chicago students and faculty.
Subsequently, Johnson in his Ely lecture (1971) charges Friedman with “the
invention of a University of Chicago oral tradition that was alleged to
have preserved understanding of the fundamental truth among a
small band of the initiated through the dark years of the Keynesian
despotism” (pp. 10-11).

The issue of the existence of an oral tradition was joined the
following year by Patinkin and Friedman in this Journal’s special issue on
monetary theory. Friedman (1972, p. 932) denied that there was no
such tradition, namely,

[Patinkin's and Johnson’s charge] would indict my
perception, or integrity, or scholarship, but it would in no way
contradict the existence of an important Chicago tradition
in the field of money that had a great influence on
subsequent work in monetary economics and on my own work in
particular. . . .

Whether I conveyed the flavor of that tradition or not,
there was such a tradition; it was significantly different from
the quantity theory tradition that prevailed at other
institutions of learning.

[Journal of Political Economy, 1990, vol. 98, no. 2]
© 1990 by The University of Chicago. All rights reserved. 0022-3808/90/9802-0010$01.50

Was there an oral tradition of the quantity theory at Chicago or
not? One important piece of evidence in support of such a tradition is
Henry Simons’s enthusiastic review (1935) of Lauchlin Currie’s The
Supply and Control of Money in the United States (1934). First, it is
important to note that the quantity theory is an essential component in
Currie’s analysis, as can be seen from his opening remarks: “To write
a book at this time dealing primarily with the supply of money and
touching only incidentally on the rate and character of spending of
money is to invite hostile criticism. Even kindly disposed critics may be
inclined to dub such a treatment inadequate and superficial, while
those not so kindly disposed may be tempted to damn it with the
deadly accusation ‘quantity theory’” (p. 3). After stressing the
importance of the behavior of the money stock, Currie presents his annual
money stock estimates for 1921-33, which show a 24 percent decline
for 1929-33. 1

The existence of a Chicago oral tradition can be seen in Simons’s 
four-page review, which begins, “This book should have a significant 
and salutary effect, both on professional opinion, and on college 
teaching. It . . . expounds clearly a set of views which, while firmly 
established in the ‘oral tradition’ of some schools [leaving little doubt 
that Chicago is one of them], are [sic] meagerly represented in the 
accessible literature” (1935, p. 555). His subsequent comments are 
clearly in the quantity theory framework, including the question of 
the definition of money and what he sees as Currie’s implied approval 
of price-level stabilization. 2 He goes on to criticize Currie’s empirical 
notion of money as being too narrow, for his failure to “face the 
fundamental questions as to the proper objectives or ideal rules of 
policy” (p. 557), and for his being “guided (against all experience, one 
might say) by a faith in monetary dictators—in authorities with large 
discretionary powers” (p. 558). Nonetheless he concludes that “this 


1 Currie’s notion of money is a medium of exchange view—narrow money. Though 
markedly different in definition and construction from Friedman and Schwartz's M1, 
the two series are remarkably similar, as Patinkin (1979, p. 223) noted. The simple 
correlation coefficient between them is .89, and for annual changes it is .85. Remarkably, 
it is .93 for annual rates of growth! 

2 Simons's reading of Currie’s implied approval of price-level stabilization as a proper 
goal of monetary policy underscores a classic problem in textual exegesis, which for 
want of a better term may be called the “believing is seeing” view. His review appeared 
three issues prior to his famous article “Rules versus Authorities in Monetary Policy,” 
which of course favored a price-level stabilization rule. Robertson's (1935) review, on 
the other hand, has Currie’s proposals “coloured by the inflation-bacillus whose presence 
one cannot help detecting in Mr. Currie's blood, and which leads him to assume 
that the Government action called for will almost always be in an expansory direction” 
(p. 130). Robertson's reading in fact more accurately reflects Currie's position. 
book is one of the few economics treatises of recent years which 
has left the reviewer genuinely grateful and indebted to its author” 
(p. 558). 

Simons's use of the phrase “oral tradition” and, more important, 
the tone of his review appear to be clear evidence that such a tradition 
existed at Chicago, at least in the first half of the 1930s. To be sure, 
that tradition, as Patinkin noted in his Simons lecture (1979, p. 224, 
n. 48), or at least Simons’s perception of it, has little use for empirical 
work: “For critical students, however, Dr. Currie’s inductive 
verifications will be largely gratuitous—although everyone will be 
grateful for the excellent statistical compilation and analysis. In gen¬ 
eral, the author’s fundamental insights are so sound that failure of 
statistical confirmation would only indicate error or inadequacy in the 
statistics” (Simons 1935, p. 556). The nature of that tradition and 
whether it remained reasonably the same into the 1940s and 1950s is 
another issue, one on which there may be no reasonable consensus. 3 
Invariant Valuation When Tax Rates Change 
over Time 

Andrew B. Lyon 
University of Maryland 


Samuelson (1964) proved that if and only if depreciation for tax 
purposes is equal to economic depreciation will any two taxpayers 
with different tax rates value an asset equally. 1 He referred to this 
condition as invariant valuation; that is, the valuation of an asset with 
a given stream of receipts is independent of the tax rate of the indi¬ 
vidual. Under this condition, the tax system does not provide any one 
taxpayer a greater or lesser incentive to undertake investment than 
any other taxpayer. Samuelson’s proof pertained only to the case of 
taxpayers with different, but constant, tax rates. As he noted, “All of 
this analysis presupposes a tax rate that is uniform over time for each 
person. Obviously, if a man is to be subject to different rates, with [the 
tax rate] being a function of time, his optimal decision will be dis¬ 
torted by this fact. A proper system of carry-forwards and carry¬ 
backs, which makes [the tax rate] average out to a constant and which 
takes account of just when a man pays the tax accruing to him, will 
avoid such distortions” (p. 605). 

Under many circumstances, taxpayers’ statutory tax rates may 
change over time. Individuals and corporations are subject to gradu¬ 
ated tax rates, minimum tax rules, and restrictions on tax losses. It is 
shown in Section I that if tax rates are allowed to change over time, 
but Samuelson’s other assumptions are maintained, economic depre¬ 
ciation continues to result in invariant valuation without the need for 
carryforwards or carrybacks. 2 That is, if tax rates vary across taxpayers 
and over time, economic depreciation ensures that all taxpayers 
will value a given asset equally. 


I would like to thank Don Fullerton and Chuck Hulten for their helpful suggestions. 
This research is part of the National Bureau of Economic Research's program in 
taxation. 

1 See also King (1975), Stiglitz (1976), and Bradford (1981). A proof of Samuelson’s 
theorem in discrete time, which provided the inspiration for this note, is Hulten (1988). 

2 After writing this note, I found that Boadway and Bruce (1984) also make this 
observation and credit it to Sandmo (1979). In turn, Sandmo credits Nickell (1977). 
Sandmo’s discussion is closest to the spirit of this note, but he considers only the more 
limited case of exponential economic depreciation and his primary focus is on the 
nonneutrality of expensing when tax rates change over time. 

[Journal of Political Economy, 1990, vol. 98, no. 2] 

© 1990 by The University of Chicago. All rights reserved. 0022-3808/90/9802-0001$01.50 

In Section II, an assumption on the discount rate used by taxpayers 
is modified. With this change, economic depreciation no longer re¬ 
sults in invariant valuation. This result occurs whether tax rates are 
constant or vary over time. Conditions necessary to achieve invariant 
valuation under this alternative assumption are examined. 

I. Extension of Samuelson’s Theorem to Rates 
That Change over Time 

Following Samuelson, let the value of an asset at time $t$ be $V_t$, with net 
cash receipts $N_t$ and depreciation deductions $D_t$. The asset is assumed 
to be productive for $n$ years. The interest rate is $i_t$, and interest is 
tax-deductible. Let the taxpayer’s tax rate $\tau_t$ be a function of time. Then 
the initial value of the asset to that taxpayer is 

$$
V_1 = \frac{(1 - \tau_1)N_1 + \tau_1 D_1}{1 + i_1(1 - \tau_1)}
    + \frac{(1 - \tau_2)N_2 + \tau_2 D_2}{[1 + i_1(1 - \tau_1)][1 + i_2(1 - \tau_2)]}
    + \cdots
    + \frac{(1 - \tau_n)N_n + \tau_n D_n}{[1 + i_1(1 - \tau_1)][1 + i_2(1 - \tau_2)] \cdots [1 + i_n(1 - \tau_n)]}. \tag{1}
$$

The change in the value of the asset at time $t$ is 

$$
V_t - V_{t+1} = -(1 - \tau_t)\, i_t V_t + (1 - \tau_t) N_t + \tau_t D_t. \tag{2}
$$

Depreciation deductions are to be determined such that $V_t$ is independent 
of the tax rate. Let depreciation deductions $D_t$ equal $V_t - V_{t+1}$, 
the change in the value of the asset to the taxpayer. It will be 
shown that with this set of depreciation deductions, valuation is independent 
of the tax rate. Thus $D_t$ corresponds to economic depreciation. 

Substituting for $D_t$ in equation (2) yields 

$$
V_t - V_{t+1} = -i_t V_t + N_t. \tag{3}
$$

Since equation (3) holds for all $t$ (and given $V_{n+1} = 0$ by assumption), 
$V_n = N_n/(1 + i_n)$, $V_{n-1} = (V_n + N_{n-1})/(1 + i_{n-1})$, ..., and 
$V_1 = (V_2 + N_1)/(1 + i_1)$. The value of the asset is simply the present value of 
all cash receipts discounted at the pretax interest rate. Equation (3) is 
independent of $\tau_t$; thus the relation is satisfied for all taxpayers. 

Since the value of the asset at any point in time is identical for all 
taxpayers, invariant valuation is achieved. Each taxpayer receives 
depreciation deductions $D_t = V_t - V_{t+1}$, which is economic depreciation. 
In contrast to Samuelson’s supposition quoted in the opening passage 
to this note, no carryforwards, carrybacks, or tax credits are required 
to yield invariant valuation when tax rates change over time. 
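
The argument of this section can be checked numerically. The following is a minimal sketch, not part of the original note: it solves equation (3) backward for the pretax values, sets deductions equal to economic depreciation, and then evaluates equation (1) under several tax-rate paths. The function names and the three-year cash flows, interest rates, and tax paths are illustrative assumptions.

```python
def pretax_values(receipts, interest):
    """Solve eq. (3) backward: V_t = (V_{t+1} + N_t) / (1 + i_t), with V_{n+1} = 0.

    Returns a list of length n + 1 whose entry k holds V_{k+1}.
    """
    values = [0.0] * (len(receipts) + 1)
    for k in range(len(receipts) - 1, -1, -1):
        values[k] = (values[k + 1] + receipts[k]) / (1.0 + interest[k])
    return values


def after_tax_value(receipts, deductions, interest, tax):
    """Evaluate eq. (1): after-tax flows discounted at i_t * (1 - tau_t)."""
    value, discount = 0.0, 1.0
    for n_t, d_t, i_t, tau_t in zip(receipts, deductions, interest, tax):
        discount *= 1.0 + i_t * (1.0 - tau_t)
        value += ((1.0 - tau_t) * n_t + tau_t * d_t) / discount
    return value


# Hypothetical three-year asset; every number here is an assumption.
receipts = [100.0, 80.0, 60.0]
interest = [0.10, 0.08, 0.12]
values = pretax_values(receipts, interest)
# Economic depreciation: D_t = V_t - V_{t+1}.
economic_dep = [values[k] - values[k + 1] for k in range(len(receipts))]

# Untaxed, constant-rate, and time-varying tax paths.
for tax_path in ([0.0, 0.0, 0.0], [0.3, 0.3, 0.3], [0.5, 0.2, 0.35]):
    value = after_tax_value(receipts, economic_dep, interest, tax_path)
    print(tax_path, round(value, 6))
```

Under these assumptions each tax path should print the same value, namely the present value of receipts at the pretax interest rates. Replacing `economic_dep` with another schedule, such as straight-line deductions, generally breaks the equality.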
TABLE 1 

Required Depreciation for Invariant Valuation 

| After-Tax Discount Rate | Tax Rate Assumption: Tax Rates Constant over Time | Tax Rate Assumption: Tax Rates Vary over Time |
|---|---|---|
| $i_t(1 - \tau_t)$ (debt finance) | Case 1: Economic depreciation (Samuelson) | Case 2: Economic depreciation |
| $r_t^e$ (independent of $\tau$) (equity finance) | Case 3: Expensing or $D_t = N_t$ | Case 4: $D_t = N_t$ |
| $i_t(1 - \tau_t^*) = r_t^e$ (firms use both debt and equity finance; tax credits calculated relative to $\tau_t^*$) | Case 5: Any method of depreciation | Case 6: Any method of depreciation |

Note.—In all cases tax rates may also vary across taxpayers. In cases 5 and 6, $r_t^e$ is common to all taxpayers. 



As pointed out by King (1975) and Stiglitz (1976), if investment is 
debt-financed, interest payments are deductible, and taxpayers re¬ 
ceive economic depreciation deductions, then no tax liability results 
for a marginal investment. The tax system is therefore in effect a 
lump-sum tax, applying only to inframarginal profits. It is essentially 
under these same conditions that Samuelson shows that valuation is 
invariant. In the case of an asset financed with debt, no taxes are paid 
on the earnings of the asset, so the asset will be valued equally by all 
taxpayers, regardless of what their tax rates are or whether these rates 
change over time. 

These results are summarized in table 1 as case 1 and case 2. Case 1, 
the case examined by Samuelson, assumes that tax rates vary across 
taxpayers but are constant over time. If investors’ after-tax discount 
rates equal $i_t(1 - \tau_t)$, then economic depreciation is required for 
invariant valuation. Under the same assumption on discount rates, 
case 2 extends this result to the case of tax rates that vary across 
taxpayers and over time. 

II. Further Results 

One crucial assumption required for these results, as well as for 
Samuelson's original proof with constant tax rates, is that the after-tax 
discount rates of investors vary by a factor of one minus the investor's 
tax rate. After-tax discount rates may be determined in this manner if 
investors deduct interest payments and are taxed on interest earned. 
This discount rate assumption may not be valid when firms with different 
statutory tax rates use equity to finance investment. If stockholders 
of all firms require the same rate of return before personal 
taxes, then the equity-financed discount rate (net of corporate tax) 
will be equal across firms. Economic depreciation will not maintain 
invariant valuation in this case. 

Without the use of tax credits, two firms with different tax rates 
(which may change over time) but equal discount rates will value an 
asset equally if and only if depreciation deductions each period are 
equal to the cash receipts of the asset ($D_t = N_t$). 3 Note that just as in 
the case of debt-financed investment considered in Section I, no tax 
revenue will be collected from investment unless there are inframar¬ 
ginal profits. Valuation is invariant because no tax liability results to 
the firm. 

The present value of depreciation deductions in the equity- 
financed case is equal to the original value of the asset. One might 
mistakenly conclude that expensing, which has the same present 
value, will also maintain invariant valuation. Recall that expensing 
with no deduction for interest yields invariant valuation when taxpay¬ 
ers have constant tax rates over time (see King 1975; Bradford 1981). 
However, expensing is not neutral across taxpayers if tax rates change 
over time. A taxpayer with a temporarily high tax rate when the asset 
is expensed will value the asset more than a taxpayer with a constant 
tax rate over time. 
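
A small numerical sketch may help separate the two equity-finance results. It is an illustration under stated assumptions rather than the author's derivation: the after-tax discount rate is a constant `rate` common to all taxpayers, "valuation" under expensing is read as the break-even purchase price when the whole price is deducted at the purchase date at rate `tax_at_purchase`, and the cash flows and tax rates are hypothetical.

```python
def present_value(flows, rate):
    """Discount end-of-period flows at a constant per-period rate."""
    return sum(flow / (1.0 + rate) ** t for t, flow in enumerate(flows, start=1))


def value_with_receipt_deductions(receipts, tax, rate):
    """With D_t = N_t, the after-tax flow (1 - tau_t) N_t + tau_t D_t equals N_t."""
    flows = [(1.0 - tau) * n + tau * n for n, tau in zip(receipts, tax)]
    return present_value(flows, rate)


def break_even_price_with_expensing(receipts, tax_at_purchase, tax, rate):
    """Highest price P worth paying when P is deducted at purchase:
    P * (1 - tau_0) = sum_t (1 - tau_t) N_t / (1 + r)^t.
    """
    after_tax = [(1.0 - tau) * n for n, tau in zip(receipts, tax)]
    return present_value(after_tax, rate) / (1.0 - tax_at_purchase)


# Hypothetical asset and rates; all numbers are assumptions.
receipts = [100.0, 80.0, 60.0]
rate = 0.06

# D_t = N_t: the value does not depend on the tax path.
print(value_with_receipt_deductions(receipts, [0.3, 0.3, 0.3], rate))
print(value_with_receipt_deductions(receipts, [0.5, 0.2, 0.35], rate))

# Expensing: neutral with a constant rate, not with a temporarily high rate.
print(break_even_price_with_expensing(receipts, 0.3, [0.3, 0.3, 0.3], rate))
print(break_even_price_with_expensing(receipts, 0.5, [0.3, 0.3, 0.3], rate))
```

In this sketch the taxpayer whose rate is temporarily high at purchase deducts the price at a high rate but pays tax on later receipts at a lower one, so the break-even price exceeds that of the constant-rate taxpayer.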

Cases 3 and 4 of table 1 summarize these results under the assumption 
that the firms’ after-tax discount rate, $r_t^e$, is independent of their 
tax rate. In case 3, when tax rates are constant over time, either 
expensing or depreciation deductions equal to cash receipts result in 
invariant valuation. When tax rates vary over time, case 4, invariant 
valuation requires that depreciation deductions equal cash receipts. 

Finally, it should be noted that tax authorities generally would not 
be able to identify whether a particular investment was financed with 
debt or equity. As shown above, economic depreciation allowances 
are required for debt-financed investment, if interest is deductible, 
and more generous depreciation allowances are required for equity- 
financed investment. 

A resolution to this problem requires that discount rates be inde¬ 
pendent of the source of finance. In general, this can be achieved only 
if all taxpayers are subject to the same tax rate. Let us assume that all 
firms face the same pretax rate of interest for debt finance, $i_t$, and the 
same required rate of return net of corporate tax for equity finance, 
$r_t^e$, with $i_t \geq r_t^e$. Note that $i_t(1 - \tau_t^*) = r_t^e$ for some $\tau_t^*$ at each point in 
time. If all taxpayers were taxed at rate $\tau_t^*$, then the after-tax cost of 

3 This can be verified in a manner similar to eqq. (1)-(3). 
debt finance would be equated with the cost of equity finance for all 
taxpayers. Similarly, if taxpayers face different statutory tax rates at 
any point in time, then a tax credit calculated relative to $\tau_t^*$ would be 
required. 4 Such a tax credit would equate period-by-period tax payments 
for all taxpayers for any stream of depreciation allowances. 
Invariant valuation would then be achieved essentially by taxing all 
taxpayers at the same tax rate, $\tau_t^*$. 

Cases 5 and 6 of table 1 summarize these results. When the source 
of finance is unobservable to the tax authority, a system of tax credits 
(or, alternatively, carrybacks or carryforwards with interest) is re¬ 
quired. This is true both for the case of constant tax rates and for tax 
rates that vary over time. 
Passions within Reason: The Strategic Role of the Emotions. By Robert H. Frank. 
New York: Norton, 1988. $19.95. 

The typical economic model contains assumptions concerning human prefer¬ 
ences that are clearly contradicted by most individuals' day-to-day experience. 
For example, the men and women I interact with do not consistently behave 
in a narrowly self-interested fashion, but rather are also affected by feelings 
of altruism, fairness, integrity, and other notions that are hard to fit into the 
standard paradigm. Numerous arguments have been put forth for why such 
observations do not warrant an overhaul of what constitutes the economic 
framework. I predict that after reading Robert Frank’s new book most read¬ 
ers will find almost all these arguments to be much less persuasive. 

Frank starts from the most primary of first principles. As with other 
species, the human species is the product of the evolutionary process known 
as natural selection. Put succinctly, natural selection is the competition among 
organisms to pass on genes into the next generation, and the traits that are 
selected are those that lead to success in this competition. Most previous 
authors who have thought along these lines have employed this perspective to 
justify the standard paradigm. Their logic is straightforward. Most of human 
evolution was characterized by a strong positive correlation between a family’s 
wealth and its number of children, and in fact many societies were polygy- 
nous and also exhibited a positive correlation between a man’s wealth and his 
number of wives. Since this implies that traits that lead to the maximization of 
wealth should be most favored, these previous authors have concluded that 
self-interested behavior should be the outcome. 1 

Frank argues that this logic is too simplistic in that in many situations self- 
interested behavior is actually self-defeating. That is, in many environments 
an individual whose behavior is ruled by his or her emotions can actually do 
better than an individual who continually weighs costs and benefits. The 
reason is that the emotions can serve as a commitment device, and hence 
being ruled by them can allow an individual to behave in a manner that is time 
inconsistent in the absence of emotions. Consider, for example, the owner of 


1 Other than Frank, two authors who have used the evolutionary approach to derive 
predictions counter to the standard paradigm are Hirshleifer (1987) and Waldman 
(1988). Hirshleifer presents an argument very similar to Frank's, which states that 
emotions may have evolved to solve problems of commitment or time inconsistency (for 
other work related to the ideas in Frank's book, see Akerlof [1988], Gauthier [1985], 
Sen [1986], and Frank [1987]). Waldman (1988) provides an evolutionary rationale for 
the finding in the psychology literature that men tend to systematically overestimate their 
own abilities. 

a firm who would like to deter entry into his firm’s market. The owner could 
achieve this result by binding the firm to behave in a very aggressive fashion if 
entry were to occur. If he could do this and communicate to potential en¬ 
trants that he had indeed bound the firm in this way, then profits would be 
increased. However, in the absence of the ability to commit the firm’s future 
behavior, the standard game theory equilibrium is that entry occurs and 
the incumbent firm “acquiesces.” In other words, the incumbent's profits are 
lower than if a commitment device were available. 

Now suppose that there is no way for the owner to bind the future behavior 
of the firm, but that the owner of the firm is not always ruled by narrow self- 
interest. Rather, the owner is a vengeful person who would act aggressively in 
the event of entry even if such behavior entailed high costs to himself. Frank’s 
point is that, if the potential entrant can recognize that the owner has a 
vengeful disposition, then entry will not occur and the individual ruled by his 
emotions will outperform the one who continuously weighs costs and benefits. 
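
The entry-deterrence logic of the last two paragraphs can be written as a two-step backward induction. The sketch below is only an illustration: the payoff numbers are hypothetical, chosen so that fighting is costly to the incumbent once entry has occurred, and the entrant is assumed to observe the owner's disposition, as in the reviewer's description.

```python
# Payoffs are (entrant, incumbent); the numbers are hypothetical assumptions.
PAYOFFS = {
    "stay_out": (0, 10),
    ("enter", "acquiesce"): (3, 5),
    ("enter", "fight"): (-2, 1),
}


def incumbent_response(vengeful):
    """A vengeful owner fights regardless of cost; a narrowly
    self-interested owner picks the response with the higher payoff."""
    if vengeful:
        return "fight"
    return max(("acquiesce", "fight"), key=lambda a: PAYOFFS[("enter", a)][1])


def play(vengeful):
    """Backward induction: the entrant anticipates the incumbent's response
    and enters only if entry pays more than staying out."""
    response = incumbent_response(vengeful)
    if PAYOFFS[("enter", response)][0] > PAYOFFS["stay_out"][0]:
        outcome = ("enter", response)
    else:
        outcome = "stay_out"
    return outcome, PAYOFFS[outcome]


for disposition in (False, True):
    print("vengeful" if disposition else "self-interested", play(disposition))
```

With these payoffs the self-interested owner acquiesces after entry, so entry occurs, while the recognizably vengeful owner is never tested and keeps the higher no-entry profit.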

At this point a number of readers might feel that they have found a fundamental 
flaw in the argument. That is, for those well versed in the modern 
concepts of self-selection and incentive compatibility constraints, there seems 
to be a problem. Why doesn't the narrowly self-interested owner simply 
mimic the pre-entry behavior of the vengeful owner? Through such imitation 
he should be able to reap any potential returns from successful entry deter¬ 
rence without actually bearing the costs of acting aggressively if entry were in 
fact to occur. Frank answers this criticism through an elegant use of one of 
the principles of evolutionary theory. The existence of individuals who pre¬ 
tend to be vengeful reduces the return to actually being vengeful because the 
presence of imitators would ultimately cause vengeful looking behavior to 
seem less threatening. Evolutionary theory states that in such an environment 
there is a tendency for the initial trait to evolve in a manner that makes 
mimicry difficult. Further, as described by Frank in some detail, this rea¬ 
soning is exactly consistent with how the emotions in humans actually reveal 
themselves. For example, facial expressions that reveal emotional states of 
mind are only partially under voluntary control, and hence the imitation of 
emotional states is not an easy task. The result is that few of us can success¬ 
fully misrepresent our emotional states of mind on a consistent basis. 

In summary, there are two steps to Frank’s argument. First, natural selec¬ 
tion may have favored individuals who are in some instances ruled by their 
emotions because emotional behavior can be important for solving problems 
of commitment or time inconsistency. Second, if natural selection were to 
have worked in this way, then we should also expect emotional states of mind 
to reveal themselves in a manner that makes imitation quite difficult. In turn, 
the evidence indicates that this is how emotional states of mind are indeed 
revealed. 

After having developed this basic argument, Frank proceeds to show how 
the perspective can help us understand a wide range of phenomena, many of 
which pose severe problems for the traditional self-interest paradigm. The 
book applies the perspective to why many individuals seem to exhibit a pref¬ 
erence for fairness in economic transactions, why cooperation does not seem 
to be restricted to repeated interactions, and generally why we see a host of 
behaviors that are hard to understand in the context of the standard para¬ 
digm. I found these applications to be in general well reasoned, well pre¬ 
sented, and quite persuasive. 

There is one additional theme that runs throughout the book. Specifically, 
the extent to which behavior is ruled by the emotions—and, more important, 
which emotions are of the most consequence—is largely determined by the 
socialization process and the nature of the environment; that is, culture mat¬ 
ters. In relationship to the rest of the book, I found the discussion of this issue 
somewhat disappointing. Frank treats the idea that culture is important as a 
rather obvious point and provides little systematic discussion. Although the 
basic point may be quite obvious, there is a related question quite important 
for Frank's argument that does not have such an obvious answer: How do 
cultures themselves evolve? Economists have paid little attention to this issue, 
but it is drawing increasing attention in some of the other social sciences. An 
excellent book on the topic is Boyd and Richerson (1985), which surveys the 
literature to date and provides a variety of specific mathematical models of 
the cultural evolutionary process (see also Lumsden and Wilson 1981). I felt 
that Frank’s book would have benefited from a chapter that links his ap¬ 
proach to the burgeoning literature in this area. 

Overall, Frank has written an important book that is well worth reading. It 
challenges the traditional paradigm in a logically consistent and quite persua¬ 
sive fashion. I would recommend it to all who are truly interested in the real 
factors that underlie human behavior. 

Michael Waldman 

University of California, Los Angeles 
In a way the most remarkable feature of this book on mineral economics is its 
authorship, which includes two leading economists, a leading geologist, and a 
materials specialist. This is not a collective work; the authors are listed 
alphabetically, and the reader is not told who wrote what. Presumably all four take 
responsibility for the conclusions. 

The book starts out by contrasting two views of the role of resources in 
future economic development. One view, labeled “pessimistic,” considers it 
obvious that low-cost sources of many minerals will be exhausted within the 
foreseeable future and that the resulting switch to high-cost sources will have 
a devastating impact on economic growth. This view is widespread among 
geologists and is shared by many laymen; one of the few economists who held 
it (at least in his youth) was Jevons (1865). The second, “optimistic,” view is 
popular mostly among economists: it points to the experience of the last two 
centuries as evidence that the depletion of minerals has not had an adverse 
effect on growth until now, and indeed has not even raised the relative prices 
of most minerals. Thus Herfindahl (1959) found that the deflated price of 
copper had not risen significantly between 1870 and 1957, and this approxi¬ 
mate “trendlessness” has continued in the 30 years since then. In the optimis¬ 
tic view, the processes of anticipatory pricing, substitution, and technological 
change that presumably brought about this favorable outcome may be ex¬ 
pected to operate in the future as well. The four authors see their study as “a 
step toward resolving the debate between geologists and economists” (p. 3). 

The difficulties in this worthy enterprise are inadvertently brought out by 
the opening sentence: “Few will quarrel with the statement by the eminent 
geologist T. S. Lovering ... that ‘rich minerals are a nation’s most valuable but 
ephemeral possession’ " (p. 1). This variation on the pessimistic theme may be 
an article of faith among geologists, but economists who consider the facts will 
be less easily convinced. As to the importance of minerals, there no doubt 
exist a few countries with small populations and vast oil fields whose riches are 
mostly under the ground; Kuwait comes to mind. In the wealth of most 
nations, however, minerals are surely dwarfed by human and physical capital. 
Whether minerals are especially ephemeral is equally doubtful. Mexico and 
Peru, for instance, have produced silver and copper for many centuries. The 
tin mines of Cornwall were operated from prehistoric times until quite re¬ 
cently. Even now, a century and a quarter after Jevons (1865) worried that 
Britain would run out of easily mined coal, his concern remains premature. If 
production in a particular mine or mineral province declines or ends, it rarely 
happens because the deposits are physically exhausted, but more probably 
because new discoveries have reduced extraction costs elsewhere or because 
substitutes have become available. Thus the recent expansion of gold output 
in North America and Australia has shown that, when the price is right, there 
is still plenty of “gold in them thar hills.” 

To sum up, Lovering's proposition appears obvious only until one looks at 
reality. Its contrary is more nearly true: for most countries, minerals, far from 
being important and ephemeral, are unimportant and enduring. This is 
confirmed by another observation: even mineral-rich countries typically have 
only a small fraction of their labor force employed in the mineral industries. 
In countries with average mineral endowment, such as the United States, the 
mineral industries account for no more than about 1 percent of employment. 
It follows that the pessimistic view greatly exaggerates the role of minerals in 
the economy. 

The authors avoid such conclusions: indeed they compound their accep¬ 
tance of Lovering’s superficial rhetoric by dismissing econometric (i.e., factu¬ 
ally oriented) research on minerals for having paid insufficient attention to 
the possibility of exhaustion. Their criticism would have been more persuasive 
if they had not ignored the theoretical work of Hotelling (1931) and his 
many followers, which is at the core of the “optimistic” view. The main mes¬ 
sage of that work is not to worry about exhaustion since it will be duly 
reflected in market prices. The authors’ inexplicable failure to refer to this 
substantial and directly relevant literature casts some doubt on their stated 
intention to bring economics and geology closer together. 

The debate between geologists and economists is approached in this book 
by analyzing projections from a detailed model of a particular mineral, 
namely copper. The choice of copper is motivated by a classification of miner¬ 
als into those that are a major component of common rocks (such as iron and 
aluminum) and those that are “geochemically scarce.” Minerals in the former 
category are called “geochemically abundant” or “superabundant” and are 
treated as inexhaustible for all practical purposes. Copper is the most widely 
used mineral in the second category: at present it is obtained from ores (as 
opposed to common rocks), but these are gradually being depleted. One day, 
conceivably, copper may have to be extracted from common rock, where it 
occurs only in relatively minute amounts. Extracting 1 ton of copper from 
common rock would require processing many thousands of tons of otherwise 
worthless material, compared to at most 200 tons when presently marginal 
ores are used. The extraction cost of a metallic mineral is roughly propor¬ 
tional to the volume of material that needs to be processed, but depends also 
on other factors. 

The recovery of a geochemically scarce mineral from common rock is described 
as the “backstop technology.” This term denotes a method of production 
that is always feasible but that is uneconomic when other methods of 
production are available. The backstop technology has two features that par¬ 
tially offset its necessarily high cost: since it does not require exploration, the 
royalty—equal to the expected cost of finding minerals—will be zero. Fur¬ 
thermore, the extraction cost (the only cost component left at that point) 
remains constant from then on. Paradoxical though it may seem, the normal 
progression of a mineral is from “geochemically scarce” to “geochemically 
abundant.” In the case of copper, moreover, there exists a “mineralogical 
barrier” between ores and common rock, which means that no ores of very 
low grade are known to exist. A geochemical explanation of this barrier is 
presented. 

The discussion of geochemical scarcity and of the backstop technology is 
the most insightful part of this book. This discussion also disposes of the naive 
doomsday view that most minerals will actually be exhausted in some finite 
time span. The pessimistic view, whatever its other merits, is more sophis¬ 
ticated and does not predict exhaustion. The discussion actually comes to a 
slightly more optimistic conclusion than economic analysis of the Hotelling 
type, since the latter does not recognize the price ceiling set by the backstop 
technology. 

The copper model is designed to illustrate and quantify the theory of 
ultimate geochemical abundance. It is formulated as a dynamic variant of 
linear programming, which permits an elegant treatment of the geological 
constraints and the substitution in materials use that are the authors' principal 
concern. This elegance comes at a significant cost, however, since it makes 
econometric estimation and verification difficult and permits no more than 
perfunctory attention to final demand. It also implies, unrealistically, that 
leaner resources will not be taken into production until the richer ones are 
exhausted. Quadratic programming, for instance, would have been just as 
elegant but much less restrictive. These and other limitations aside, the theoretical 
framework of this book (derived from Koopmans [1973]) appears to be 
equivalent to Hotelling’s approach as restated by Solow (1974). 

The supply side of the model is introduced by an interesting discussion of 
the geological origins of copper deposits and their occurrence in the United 
States. The model requires estimates of copper resources of various grades 
and at various depths; these are taken mostly from one of the authors’ (Skin¬ 
ner's) past writings, as is the title of the book. The method used to obtain 
these estimates is volumetric analysis, in which the volume of rock of a given 
type is multiplied by the assumed average copper content of that type of rock. 
Assuming as it does that past exploration has provided us with random sam¬ 
ples from a lognormal distribution, volumetric analysis does not inspire great 
confidence since it is not based on any plausible theory of exploration. It also 
has to be supplemented by considerable guesswork, particularly for the 
leaner ores. In Skinner’s hands, however, it is probably the best that economic 
geology has to offer. Its principal limitation in this book is geographic; it is 
applied only to the United States, excluding Alaska and Puerto Rico. The 
implication of autarky is weakly defended with the argument that as the 
United States runs out of copper ores, so will the rest of the world. 

In addition to estimating the amount of copper by grade and depth of 
deposit, the book uses figures for the associated extraction costs to come up 
with a supply function. This is presented as a “major innovation” (p. 52), but 
there is not enough detail to clarify how it was done; neither the form of the 
cost function nor the method of estimation is stated. However, there is some 
discussion of the probable cost of extracting copper from common rock, a 
process that is remote from historical experience but essential to the authors’ 
theory of ultimate geochemical abundance. 

The demand side of the model also relies on prior work by one of the 
authors (in this case Gordon). It focuses on substitution between copper and 
other materials, chiefly aluminum. The demand for all copper-containing 
products is assumed to have an income elasticity of one and a price elasticity 
of zero. These are the customary doomsday assumptions, which would virtu¬ 
ally guarantee disaster if they were at all realistic. In this case, actually, they 
are made necessary by the linear programming framework and by the deci¬ 
sion to analyze substitution on a product rather than an industry basis. The 
latter decision makes it possible to use engineering information but very hard 
to use published data on consumption and thus to verify the results. While 
one may question that decision, it should be said that the analysis of substitu¬ 
tion is quite careful in other respects and that recycling also receives due 
attention. 

The copper model is applied to a period of 180 years starting in 1970. In 
the “base case,” gross national product (to which the derived demand for 
copper is proportional by assumption) is postulated to grow at 3 percent per 
year through 2070 and at 1 percent per year thereafter. The annual discount 
rate, adjusted for inflation, is 8 percent through 2070 and 4 percent thereaf¬ 
ter. The book provides little if any justification for these assumptions, relying 
instead on sensitivity analysis to show the effect of changing parameters. The 
growth and discount assumptions change in 2070 because that is when the 
backstop resource is taken into production (in the base case). 

The real price of copper in the decade 2060-69 is projected to be about 50 
times what it was in the decade 1970-79, equivalent to an average increase of 
4.5 percent per year. The “scarcity rent” (also called “royalty” and 
corresponding to Hotelling's “net price”) increases at 8 percent per year between 
these two decades, just as it would in a Hotelling model with the same dis¬ 
count rate and without technological change. After 2070, as already men¬ 
tioned, the royalty drops to zero. 

According to the base case projections, copper is replaced by other materi¬ 
als in most of the products in which it is now an important ingredient. Even 
so, copper mines in the United States produce more than six times as much in 
2060-69 as they did in 1970-79. According to the model, the extraction of 
this large amount at a very high unit cost has a marked effect on the growth of 
real GNP by withdrawing labor and capital from other uses. Over the entire 
projection period of 180 years, growth is reduced (in the base case) by 0.4 
percent per year compared to what is projected with a constant real price of 
copper. This projection is somewhere between the optimistic and the pessi¬ 
mistic view, but closer to the former. 

How much faith can we put in these projections? Although the supply and 
demand modules of the copper model have substantial empirical ingredients, 
the model as a whole is not fully empirical and is not tested against historical 
data. At one point (p. 122) the authors seem to recognize the need for empir¬ 
ical verification, but the subject is not pursued. The implication is that fore¬ 
casts obtained from the model do not have a well-defined error distribution. 
The sensitivity analyses are of limited usefulness in assessing the reliability of 
the projections because some of the most questionable assumptions (such as 
the zero price elasticity of final demand) are not covered. 

This does not mean that the sensitivity analyses are without interest. In fact 
they range widely, yet are presented in a commendably concise fashion. Only 
one of them calls for comment: “The single most important sensitivity run 
considers the possibility of future technological change in copper and related 
industries" (p. 123). In the base case, productivity is assumed to grow at the 
same (unstated) rate in all industries. If productivity in copper grew 2 percent 
more than in other industries, the price in 2070-79 would be 87 percent 
below the base case projection; if it grew 4 percent more, the price would 
actually be lower in that decade than it was in 1970-79. Evidently the 
productivity differential is an influential parameter. The book cites Kendrick (1961), 
who found that productivity in metal mining grew slightly more between 
1889 and 1953 than in the economy as a whole. It appears, however, that 
since World War II productivity has grown less in metal mining than else¬ 
where (Houthakker 1979; Jorgenson, Gollop, and Fraumeni 1987). Metal 
mining, of course, includes much besides copper, but productivity differen¬ 
tials may be less important than the book suggests. 

This raises the key question about the copper model: Can it explain why, as 
mentioned earlier, the real price of copper has been trendless for more than a 
century? There is little reason to think it can; presumably some important 
causal factor is absent from the model, or some constraint is seriously wrong. 
If so, its projection of a strong upward trend between 1970 and 2070 is 
merely a possibility that tells us more about the model than about the future. 
The model (or some simplified version suitable for econometric testing) needs 
revision, but some of its components will no doubt be useful to other re¬ 
searchers. 

By way of a conclusion, I return to the authors’ hope of resolving the 
debate between economists and geologists. They have indeed made an 
important step in this direction, not least by fairly stating the issues and coming to a 
consensus among themselves. Their clearly written and well-organized book 
deserves to be widely but critically read. Its main weakness is insufficient 
attention to historical experience, and I hope that others will be stimulated to 
clarify past developments in order to foresee the likely future with greater 
confidence. 


Hendrik S. Houthakker 

Harvard University 
Melanges d'economie politique et sociale. By Leon Walras. 

Paris: Economica, 1987. Pp. 573. 

For those familiar with the niceties of Elements of Pure Economics (1874), 
Melanges d’economie politique et sociale will appear as it is: a bizarre book. 
Though intended by Walras to be published as a book on its own, it is actually 
a set of collected papers, written over 50 years. Miscellanea, without any 
doubt. As for the explicit purpose of the book, that is, to develop social 
economics, I must confess and shall justify my skepticism. 

1. The book under review is part of a series of 14 books, intended to be the 
complete economic works of the Walrases, father and son. The editors, 
affiliated with the Centre Auguste et Leon Walras (Lyon, France), must be 
congratulated for the job they did in preparing this volume. It is based on a 
project elaborated by Walras from 1894 to 1909 and revised by his daughter 
Aline and his disciple Etienne Antonelli in 1923, but never published before. 
A general introduction followed by short presentations to each “chapter” 
allow the reader to learn when the different papers and essays were written 
and/or published and what logic Walras intended to exhibit in grouping 
them. 

In establishing footnotes, the editors successfully avoided the “too many/too 
few” biases: too many footnotes, so that only the experts would look at it; 
too few so that they would not be helpful. Here, we find all that we need, and 
only that. Different versions, when they existed, were added to the 
manuscripts for comparison. Previously unpublished elements have been added 
that contain significant information, such as the mathematical note in Wal¬ 
ras’s review of Cournot that the publisher suppressed in 1863 (see essay 9). 
There are also exact references to books, papers, and so forth quoted, or 
alluded to, by Walras. There are precise indications on paragraphs that were 
published in previous works by Walras since he was adept at self-quotation. 
The book also contains an index of authors with, for each entry, a short note 
on who an author was or what he did that may be pertinent for understand¬ 
ing why Walras referred to him. There is no thematic index so far. But 
volume 14, listed as Tables and Index, may compensate for that, when available. 

Last, but not least, the printing is of high standard. This is a pleasant book 
to read and to have on one’s bookshelves. Typefaces are well chosen and allow 
the reader to clearly distinguish among introductions by the editors, the main 
body of the text, quotations in Walras’s prose, and addenda to previously 
published papers (see, e.g., essay 12). 

2. Now what about the content? 

From a theoretical perspective, I must confess that I found the book 
basically without interest. If Walras achieved a breakthrough in pure economics, 
he never succeeded in doing so in social economics. Here, we have the record 
of this failure. On most fundamental issues, the book is assertive, rather than 
demonstrative, and full of repetitions. Walras wanted to convince through 
reassertion, and he devoted plenty of pages to self-congratulation. 

The major ideas are those developed in Elements, with some specific insights 
that I would like to discuss briefly because they clarify Walras’s research 
program. Walras repeatedly exposed the idea that free competition among 
individuals and free trade among countries are one and the same thing: 
rooted in maximizing behavior, and the source of mutual benefits. But from 
1860 to 1909, he progressively accentuated what he perceived as limits to free 
competition. These limits are of two types: some are internal to the function¬ 
ing of markets and related to situations in which pure competition cannot 
exist, as for the railroads (see essays 14 and 17); others are embedded in 
radical inefficiencies of markets, such as the lack of redistributive mechanism 
(see essays 17, 21, 27, and 28). Solutions are to be found in the active role of 
governments as sole representatives of collective interest. In this is rooted 
what Walras called his “scientific socialism”: collective interests should be 
satisfied by governments through the production of public goods, which must 
be freely available (see essay 3), and through the redistribution of wealth, so as 
to maintain justice (essay 28). But how will governments finance these activi¬ 
ties? Taxes would introduce biases in free competition as well as in free trade. 
Walras’s answer to this problem is collective ownership of land (and of natural 
resources?). Land not being a service, there are no property rights to it for 
individuals. The state should be the sole landowner, and lands rented to 
agricultural producers should be the main source of public revenues (see 
essays 8 and 27). 

3. In arguing so, Walras clearly shifted to a utopian view, which is of 
interest primarily to historians of economic thought. Reading the Melanges, 
we get a much clearer picture of the architecture of economics as represented 
by Walras. All his contributions were motivated by a basic conviction, born in 
the 1860s (see essays 1, 4, and 8), that there are two branches in economics: 
pure economics, which is the theory of value in exchange and of social wealth 
and is a “psychic-mathematic” science; and applied economics, in which 
problems of property and taxes (i.e., government) would be analyzed, and which is 
a moral science. For Walras, both were equally important. This may explain 
why most of his work focuses on problems of applied economics (see essay 17 
for a synthesis) and on the role of economic institutions (essays 5, 6, 7, 11, 12, 
etc.). 

In developing these ideas, Walras went far from pure economics. In the 
Melanges, he appears mainly as a polemicist, opposed to classical economists, 
whom he labeled as “individualists” and “utilitarists” (a radical critique in 
Walras’s mind: see essays 19, 20, and 27), and against dogmatic socialists 
(essay 14). He thought that there were two fundamental social values: liberty, 
based on the rights of individuals, and equality, which makes room for the 
rights of collectivity (essay 20). Though he shared many prejudices of his time 
(see his judgment on Algerians in essay 11), he provocatively and convinc¬ 
ingly presented himself as a socialist of a new type: scientific, liberal, and 
humanistic (essay 28). 

As a polemicist, Walras often appears unfair to his precursors, contradic¬ 
tors, or contemporaries (on Cournot, essays 9 and 26; on Chevalier, essay 23; 
on Bastiat, essay 25; on Jevons, essays 17 and 23; etc.). Most of his critiques, 
from his dialogue of 1860 in which he is the scientist (essay 1) to his hommage to 
his protector Ruchonnet in 1909 (essay 28), were intended to establish the 
greatness of his own contribution, very often to a degree of caricature (as in 
essay 23). 

This is also what Walras was: a self-imbued man, sharing the dreams of the 
early nineteenth-century reformers who thought that the sciences were al¬ 
most achieved and that future tasks were related to their applications. Walras 
was persuaded that he achieved the foundations, that what was needed after 
the Elements was applications. Fortunately for us, he was wrong. 

Claude Menard 

University of Paris (Panthéon-Sorbonne) 
Many have argued that if labor income is difference stationary, the 
permanent income hypothesis predicts that consumption should be 
relatively volatile. In U.S. aggregate data, labor income is well char¬ 
acterized as having a unit root; however, consumption turns out to 
be relatively smooth. This anomaly is known as Deaton’s paradox. I 
resolve Deaton’s paradox by providing decompositions of labor in¬ 
come into permanent and transitory components. These preserve 
the univariate dynamic properties of labor income. However, when 
agents distinguish permanent and transitory movements in their 
labor income, as the rational expectations hypothesis asserts they 
should, the permanent income hypothesis correctly predicts the 
observed smoothness in consumption. 


I. Introduction 

Milton Friedman’s permanent income theory of consumption is one 
of the outstanding successes of dynamic economic reasoning. This 
theory asserts that consumption occurs out of permanent income, not 
current income. Permanent income is related to but distinct from 
current observed income. Under the intuition that permanent income, 
because it is “permanent,” should be smoother than current 
income, the theory has long been understood to predict that con¬ 
sumption should be smooth relative to fluctuations in observed in¬ 
come. This relative smoothness of consumption is a firmly established 
empirical regularity in aggregate time-series data. 

Deaton (1987) observed, however, that the permanent income hy¬ 
pothesis fails to generate this smoothness if labor income is an inte¬ 
grated process, that is, if labor income has a unit root. According to 
Deaton, a unit root characterization for labor income, given the data, 
implies that observed consumption is insufficiently sensitive to inno¬ 
vations in current income. He concluded that if labor income is well 
characterized as being integrated, then “the representative agent ver¬ 
sion of the permanent-income hypothesis can be rejected because it 
fails to predict the fact that consumption is smooth, the very fact that 
it was invented to explain in the first place” (p. 122). This anomaly in 
the joint behavior of consumption and income has come to be known 
as “Deaton’s paradox.” 

Deaton’s analysis, therefore, appears to argue strongly for the need 
to establish whether labor income truly is an integrated process. His 
work has suggested that the predictions of an important economic 
theory—Friedman’s permanent income hypothesis—are intimately 
related to measures of long-run persistence, such as that considered 
by Campbell and Mankiw (1987) and Cochrane (1988). 1 

This paper offers a simple and intuitive explanation for this 
smoothness in consumption. There are different kinds of distur¬ 
bances that impinge on the labor income stream. Some disturbances 
have permanent effects on labor income; other disturbances have 
only a transitory impact. I show below that the permanent income 
hypothesis under rational expectations—not surprisingly—implies 
that different kinds of disturbances have different effects on con¬ 
sumption. Disturbances that do not have permanent effects on labor 
income will not have large effects on Friedman’s permanent income. 
These disturbances will therefore have only a relatively small impact 
on consumption. On the other hand, disturbances that do have a 
permanent impact on labor income will have relatively large effects 
on consumption. 2 

1 It is, of course, Nelson and Plosser (1982) who have forcefully drawn 
macroeconomists’ attention to the fact that many aggregate time series may be difference 
stationary rather than trend stationary. In Nelson and Plosser's terminology, a 
“difference-stationary” series has a first difference that is covariance stationary, although the 
series itself is not; a “trend-stationary” series is covariance stationary about a 
deterministic time trend. In econometric terminology, therefore, a difference-stationary series is 
integrated of order one, or simply integrated, when the order can be omitted without 
ambiguity. 

2 This characterization is explicitly derived from an optimizing equilibrium model 

Therefore, according to the analysis here, the permanent income 
hypothesis prediction for the smoothness properties of consumption 
depends on the relative importance of permanent and transitory 
components in labor income. The univariate dynamics of labor in¬ 
come—whether or not labor income is integrated, how “persistent” 
labor income is, or what the precise form of the univariate dynamics 
is—turn out to be not especially informative for the predictions of the 
theory. 

It remains controversial whether macroeconomic time series are 
better characterized as integrated or as stationary about a determin¬ 
istic trend. This paper does not attempt to shed light on that issue. 
Instead it argues only that, at least within the context of the perma¬ 
nent income hypothesis, a unit root characterization for labor income 
may not have implications that are as dramatic as has previously been 
suggested. Further, and again at least within the context of the per¬ 
manent income hypothesis, the widely noted measures of persistence 
in Campbell and Mankiw (1987) and Cochrane (1988) may simply be 
beside the point. 3 

The remainder of this paper is organized as follows. Section II 
briefly reviews other explanations of “excess smoothness” that have 
been offered and makes explicit the difference between those and the 
reasoning in this paper. Section III sets out the standard permanent 
income model and makes rigorous the intuition above. Section IV 
provides decompositions of labor income into permanent and transi¬ 
tory components that reconcile (1) the observed smoothness in aggre¬ 
gate consumption, (2) the estimated univariate dynamics of labor in¬ 
come maintaining a unit root characterization, and (3) the permanent 
income hypothesis. 4 My explanation for apparent “excess smoothness” 
in consumption turns on the plausible assumption that economic 
agents forecast future labor income using strictly more information 
than the econometrician does. In many rational expectations 
models, the econometrician can take into account this superior information 
of private agents by using the endogenous variables of the 
model in the modeler’s forecasting equations. Section V shows the 
permanent income model of consumption to be a counterexample to 
the validity of this methodology: A researcher studying the observed 
behavior of the model variables would conclude that consumption is 
unresponsive to “news” in labor income, even if the permanent income 
hypothesis were to be true. 5 Section VI concludes the paper. 

below. Thus it should be distinguished from the one, such as in Stock (1988), in which 
consumers ignore transitory income altogether. Further, the coefficient on permanent 
income in the consumption equation in that work is treated as a free parameter. By 
contrast, in the kind of models considered here, that coefficient is intimately related to 
labor income dynamics. Stock expertly applies recent developments in the theory of 
regression with cointegrated variables to reestablish Friedman’s assertions about 
errors-in-variables bias. Deaton’s paradox does not arise in those kinds of models. The 
questions of interest there are different from those considered here. 

3 Recent papers that have provided arguments similar in spirit to that here are 
Blanchard and Quah (1989) and Christiano and Eichenbaum (1989). The former does 
so in a Keynesian model with sticky nominal wages, while the latter makes the point in 
discussing productivity disturbances in a real business cycle model. 

4 Note that I am not suggesting that this permanent-transitory decomposition will 
explain the other anomalies in the predictions of the permanent income model. For 
instance, it is now well known that the martingale implication for consumption is simply 
false in aggregate time-series data. See, among others, Christiano, Eichenbaum, and 
Marshall (1987), Nelson (1987), Caballero (1988a, 1988b), and Heaton (1988). 


II. Related Literature 

Deaton’s paradox has generated an extensive literature. For reasons 
of space, I shall discuss only a small fraction of the relevant work: 
Christiano (1987) and Diebold and Rudebusch (1989) provide more 
extended discussions on the literature. 

Christiano observes that movements in labor income may be related 
to interest rate fluctuations. To the extent that savings are sensitive to 
interest rate movements, equilibrium changes in consumption will be 
dampened by appropriate comovements in income and the interest 
rate. Thus, conditional on a given pattern of labor income dynamics, 
an equilibrium theory might, in principle, predict consumption to be 
less volatile than implied by a model with a constant interest rate. 
Christiano therefore studies a simple general equilibrium real 
business cycle model that allows the interest rate to vary over time. By 
appropriately setting parameter values, he is able to match the ob¬ 
served volatility of changes in consumption. Christiano points out, 
however, that when he does this, the model is unable to replicate the 
actual dynamic behavior of income in the U.S. economy. 

Caballero (1988a) modifies the preferences of the infinite-lived rep¬ 
resentative consumer to allow “taste shocks” and to incorporate 
an explicit “precautionary savings” motive. Clarida (1988) and Gali 
(1989) consider the aggregation problem in infinite-lived model 
economies that are populated by finite-lived consumers. These 
modifications partially succeed in reconciling the predictions of the theory 
with the data. They all suggest that even in the presence of a unit root 
in labor income, equilibrium theory predicts that consumption may 


* After I had completed a first draft of this paper, Lars Hansen, John Heaton, and 
Thomas Sargent pointed out to me that Hansen, Roberds, and Sargent1989) contains 
analogous results. 
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be relatively smooth. Note, however, that none of these proposals 
quite confronts the challenge that Deaton posed. 6 

This same comment applies to that class of explanations that sug¬ 
gest that labor income may in fact not be difference stationary, or that 
even if it were difference stationary, the usual estimates of long-run 
persistence may simply be “too large." Diebold and Rudebusch (1989) 
suggested a fractional integration model for labor income; Watson 
(1986) used an unobserved-components model. In U.S. aggregate 
data, these alternative parameterizations of the Wold moving average 
representation imply point estimates for long-run persistence smaller 
than those in Deaton (1987) or West (1988). Again, these suggestions 
partially succeed in reconciling the optimizing theory with the data. 
As Cochrane (1988) does, these papers properly warn that estimates 
of long-run persistence may be quite sensitive to specification. 
However, according to the analysis developed below, the estimates of 
long-run persistence are simply not especially relevant. 

The conclusions of this paper assume that the researcher has 
strictly less information than agents. In many rational expectations 
applications, this is not important since the model variables will reveal 
all relevant information. This insight underlies the many Euler equa¬ 
tion-type tests of market efficiency and equilibrium models. In the 
current setting, however, when there is more than one disturbance 
affecting labor income, the permanent income hypothesis also pre¬ 
dicts that the model variables, consumption and income, cannot ap¬ 
propriately reveal the true effects on consumption of “news” in labor 
income. In fact, an example below shows that an econometrician 
studying the joint dynamics of consumption and income will conclude 
that consumption seems not to respond to certain news in labor in¬ 
come, even when the permanent income hypothesis is true. Thus the 
econometrician will conclude that consumption appears to be “exces¬ 
sively smooth,” even though in truth it is not. Note, however, that this 
does not explain the rejections of the permanent income hypothesis 
in West (1988) and Campbell and Deaton (1989) since those research¬ 
ers employed information on asset holdings in addition to consump¬ 
tion and income. Following the reasoning in Campbell and Deaton, 
that rejection must therefore arise from violation of the usual cross¬ 
equation restrictions, and not from excess smoothness per se. 

Flavin (1988) has criticized the work by West and Campbell and 
Deaton from a different perspective. Her model departs from the 
permanent income theory; under the hypothesis in that work, 
consumption and savings will not contain all relevant information. By 
contrast, I argue here that the main force of Flavin’s conclusion holds 
even under the permanent income hypothesis. 

6 See the quote from Deaton in the Introduction of this paper. Their models, 
however, also imply that consumption is not a martingale. 

III. The Model 

Hansen (1987) and Sargent (1989) have provided dynamic general 
equilibrium interpretations of the permanent income hypothesis 
(PIH). The specification follows that in Hall (1978) and Flavin (1981) 
and comprises the following three equations: 

$$C(t) = rW(t), \tag{1}$$

$$W(t) = K(t) + \left[(1 + r)^{-1} \sum_{j=0}^{\infty} (1 + r)^{-j} E_t Y(t + j)\right], \tag{2}$$

$$K(t + 1) = (1 + r)K(t) + Y(t) - C(t). \tag{3}$$

Equation (1) states that consumption in each period equals perma¬ 
nent income. This is simply the flow of rental income from total 
wealth W, accruing at the time-invariant equilibrium risk-free interest 
rate r. In equation (2), total wealth is the sum of physical capital K and 
human wealth. Human wealth, in turn, is the expected present dis¬ 
counted value of the stream of labor income Y. As usual in this litera¬ 
ture, I assume that the labor income stream is exogenous with respect 
to the agent’s consumption decision. However, total income—the sum 
of labor and capital income—obviously depends on the agent’s past 
consumption decisions. In summary, the agent consumes the 
resource stream that flows from renting out, in perfect markets, all her 
physical and human capital. Equation (3) simply defines capital stock 
transition: capital does not depreciate and accumulates through 
agents’ savings. 

These equations can be combined to obtain 
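
$$\Delta C(t) = C(t) - C(t - 1) = r(1 + r)^{-1} \sum_{j=0}^{\infty} (1 + r)^{-j} \left[E_t Y(t + j) - E_{t-1} Y(t + j)\right].$$

Define $\beta$ to equal $(1 + r)^{-1}$. Then 

$$\Delta C(t) = (1 - \beta) \sum_{j=0}^{\infty} \beta^{j} \left[E_t Y(t + j) - E_{t-1} Y(t + j)\right]. \tag{4}$$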

As numerous authors have emphasized, the right-hand side of (4) is 
the annuity value of revisions in the expected labor income stream; 
these revisions are due to new information arriving in period t. The 
model predicts that the larger the impact of news on human wealth is, 
the larger should be the change in consumption. 
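
The step from equations (1)-(3) to (4) is not spelled out in the text; one way to sketch it is given below. The shorthand $H(t)$ for human wealth (the bracketed term in (2)) is notation introduced here for illustration only, and the sketch uses $E_t Y(t) = Y(t)$ together with $\beta = (1 + r)^{-1}$.

```latex
% H(t) is shorthand introduced here for human wealth, the bracketed term in (2).
\begin{align*}
H(t) &= \beta \sum_{j=0}^{\infty} \beta^{j} E_t Y(t+j), \qquad W(t) = K(t) + H(t), \\
(1+r)\,W(t) &= (1+r)K(t) + Y(t) + \beta \sum_{j=0}^{\infty} \beta^{j} E_t Y(t+1+j), \\
W(t+1) &= (1+r)K(t) + Y(t) - C(t) + \beta \sum_{j=0}^{\infty} \beta^{j} E_{t+1} Y(t+1+j) && \text{by (3)}, \\
W(t+1) - W(t) &= \beta \sum_{j=0}^{\infty} \beta^{j} \left[ E_{t+1} Y(t+1+j) - E_t Y(t+1+j) \right] && \text{using } C(t) = rW(t), \\
\Delta C(t) &= r\beta \sum_{j=0}^{\infty} \beta^{j} \left[ E_t Y(t+j) - E_{t-1} Y(t+j) \right], \qquad r\beta = 1 - \beta.
\end{align*}
```

The last line follows from $\Delta C(t) = r[W(t) - W(t-1)]$ after shifting the time index back by one period, and it matches (4) because $r\beta = r/(1 + r) = 1 - \beta$.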

Equation (4) yields two potentially refutable predictions. First, 
given a particular data-generating process for labor income, the mag¬ 
nitude of consumption’s response to news can be calculated from (4). 
Second, information available prior to an arbitrary time period t 
should neither affect nor help to predict the change in consumption 
at t; that is, consumption should be a martingale with respect to 
agents’ information. This is simply Hall’s (1978) famous 
characterization of consumption under the PIH. Contradiction of these 
implications is referred to as “excess smoothness” and “excess sensitivity,” 
respectively. 

In this model, Hall’s martingale characterization is clearly indepen¬ 
dent of the exact process that generates labor income. However, the 
appropriate statistical theory for inference should, of course, depend 
on the properties of the instruments used to examine the martingale 
restriction. But this will always be true in any econometric procedure 
and is not particularly special to the PIH. 

The smoothness predictions, however, depend critically on the 
model generating labor income. That model is what defines news, 
which, in turn, affects consumption through (4). To see this explicitly, 
I briefly summarize Deaton’s (1987) excess smoothness argument. 

First, suppose that labor income Y is a trend-stationary process. We 
can, without loss, take the trend to be identically zero since here we 
are interested only in the second-moment properties of consumption 
and income. If Y has finite time-invariant second moments, it is 
guaranteed to have a unique Wold representation: 


$$Y(t) = \sum_{k=0}^{\infty} b(k)\eta(t - k) = B(L)\eta(t),$$

where $b(0) = 1$, the function $B(z) \stackrel{\text{def}}{=} \sum_{k=0}^{\infty} b(k)z^{k} \neq 0$ for all $z$ on the 
closed unit disk, $L$ denotes the lag operator, and $\eta$ is serially 
uncorrelated. Suppose that the representative agent uses only current and 
lagged labor income observations to forecast future labor income. A 
result due to Hansen and Sargent (1980, app. A) then implies a 
simple formula for human wealth: 7 

$$\sum_{j=0}^{\infty} \beta^{j} E_t Y(t + j) = \left[\frac{LB(L) - \beta B(\beta)}{L - \beta}\right]\eta(t).$$


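7 I assume throughout that expectations coincide with linear projections.

Further, by a property of iterated expectations,

$$\sum_{j=0}^{\infty} \beta^j E_{t-1} Y(t+j) = E_{t-1}\left[\sum_{j=0}^{\infty} \beta^j E_t Y(t+j)\right] = E_{t-1}\left\{\left[\frac{L B(L) - \beta B(\beta)}{L - \beta}\right]\eta(t)\right\} = \left[\frac{L B(L) - \beta B(\beta)}{L - \beta}\,L^{-1}\right]_+ L\,\eta(t),$$

where $[\,\cdot\,]_+$ denotes the annihilation operator. 8 When we use these in
(4), the resulting change in consumption is

$$\Delta C(t) = (1 - \beta)\left\{\frac{L B(L) - \beta B(\beta)}{L - \beta} - \left[\frac{L B(L) - \beta B(\beta)}{L - \beta}\,L^{-1}\right]_+ L\right\}\eta(t).$$

This simplifies to

$$\Delta C(t) = (1 - \beta) B(\beta)\,\eta(t). \tag{5}$$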


Thus in any given period, the change in consumption depends on (1)
the interest rate through $\beta$, (2) the dynamics $B$ of labor income, and
(3) the innovation $\eta$ in labor income. Given $\beta$ and $B$, the change in
consumption is proportional to news in labor income. Since $\beta$ is close
to one for small values of the interest rate, other things equal, changes
in consumption should be relatively “small.”
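
The simplification leading to (5) can be checked numerically. The sketch below is only an illustration: the MA(2) coefficients in `b` and the mapping $\beta = 1/(1+r)$ are assumptions made for the example, not values taken from the text. It divides $L B(L) - \beta B(\beta)$ by $L - \beta$ and confirms that the constant term of the quotient, which is the part that survives after the annihilation step, equals $B(\beta)$.

```python
def eval_poly(coeffs, z):
    """Evaluate sum_k coeffs[k] * z**k (ascending powers) by Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * z + c
    return result


def hansen_sargent_quotient(b, beta):
    """Divide L*B(L) - beta*B(beta) by (L - beta).

    Returns (quotient coefficients in ascending powers of L, remainder).
    """
    # Numerator coefficients in ascending powers of L.
    numer = [-beta * eval_poly(b, beta)] + list(b)
    n = len(numer) - 1              # degree of the numerator
    quotient = [0.0] * n
    carry = 0.0
    # Synthetic division, working down from the highest power.
    for k in range(n, 0, -1):
        carry = numer[k] + beta * carry
        quotient[k - 1] = carry
    remainder = numer[0] + beta * carry
    return quotient, remainder


beta = 1.0 / 1.01                   # assumed: beta = 1/(1+r) with r = 1 percent
b = [1.0, 0.5, 0.2]                 # hypothetical Wold coefficients, b(0) = 1
quotient, remainder = hansen_sargent_quotient(b, beta)

# Only the constant term of the quotient is "news" at date t.
consumption_response = (1.0 - beta) * quotient[0]
print(quotient[0], eval_poly(b, beta), consumption_response)
```

Because $\beta$ is close to one, the printed consumption response is small relative to $B(\beta)$, which is the point made in the paragraph above.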

Next, suppose that labor income is difference stationary. As in the
treatment of the trend-stationary case, we shall ignore possible drift in
labor income since that cannot affect the second-moment properties
of consumption and income. Denote changes in $Y$ by $\Delta Y$. Under the
assumption that the process $\Delta Y$ has finite time-invariant second moments,
it necessarily has a unique Wold representation:

$$\Delta Y(t) = \sum_{k=0}^{\infty} a(k)\,\varepsilon(t-k) = A(L)\,\varepsilon(t),$$

where $a(0) = 1$, the function $A(z) \overset{\text{def}}{=} \sum_{k=0}^{\infty} a(k) z^k \neq 0$ for $|z| \le 1$, and $\varepsilon$ is
serially uncorrelated. If agents use only current and lagged values of
labor income to forecast future labor income (as Deaton [1987] assumed),
we can again use the Hansen-Sargent analysis to obtain 9


8 Loosely speaking, the annihilation operator modifies its operand by removing the
part in strictly negative powers of L in the operand's Laurent series expansion (see
Hansen and Sargent 1980).

9 Here we need to calculate the expected present discounted value of a process Y that 
is not stationary. Notice that the resulting expression contains a singularity on the unit 
circle. However, the present discounted value turns out, nevertheless, to be well 

$$\sum_{j=0}^{\infty} \beta^j E_t Y(t+j) = \left[\frac{L A(L) - \beta(1-\beta)^{-1} A(\beta)(1 - L)}{(L - \beta)(1 - L)}\right]\varepsilon(t).$$

As before, a property of iterated expectations implies that

$$\sum_{j=0}^{\infty} \beta^j E_{t-1} Y(t+j) = \left[\frac{L A(L) - \beta(1-\beta)^{-1} A(\beta)(1 - L)}{(L - \beta)(1 - L)}\,L^{-1}\right]_+ L\,\varepsilon(t).$$

By the same reasoning as above, changes in consumption follow

$$\Delta C(t) = (1 - \beta)\cdot(1 - \beta)^{-1} A(\beta)\,\varepsilon(t) = A(\beta)\,\varepsilon(t). \tag{6}$$

Equation (6) is the analogue to (5) when labor income is difference
stationary rather than trend stationary. Comparing these two equations,
notice that (6) does not contain the term $1 - \beta$; this is why,
other things equal, the PIH under difference stationarity implies a
relatively more volatile consumption series.

When $A$ and $\mathrm{var}(\varepsilon)$ are estimated on U.S. aggregate time-series
data, the implied variance of the right-hand side of (6) is significantly
larger than the sample variance of changes in consumption. 10 From
this evidence, Deaton concluded that aggregate consumption is excessively
smooth if labor income is characterized as an integrated process.

In U.S. aggregate data, labor income appears to be well described
as being integrated. 11 However, there is certainly no compelling evidence
that agents in the economy estimate human wealth using only
their labor income history. For instance, suppose that there are two
kinds of structural disturbances to labor income. One class of disturbances
has a permanent impact on the level of labor income; the other
disturbances have only a transitory impact. For simplicity, we can
suppose that there are only two structural disturbances in the economy,
one in each class. Allowing a more general specification does not
alter the conclusions of interest here, although typically the disturbances
will not aggregate naturally into the one permanent and one
transitory component (see the technical appendix in Blanchard and
Quah [1989]).


defined, by the reasoning surrounding eq. (A3) of app. A in Hansen and Sargent 
(1980). 

10 See Deaton (1987) and, among others, West (1988) and Campbell and Deaton
(1989). This result is remarkably robust across alternative specifications for A (see West
1988; Diebold and Rudebusch 1989).

11 This unit root characterization will be maintained in the subsequent analysis since 
excess smoothness arises only in this case. 
Under rational expectations, agents estimate their human wealth
using all available information on the different kinds of disturbances.
In particular, if it will improve their forecasts of future labor income,
they will use the information that there is a permanent and a transitory
component in labor income. The innovations in the different
structural disturbances may be correlated; however, one can construct
(in the natural way and without loss of generality) an orthogonal
decomposition to use in forecasting future labor income. 12

Therefore, suppose that we can write $Y(t) = Y_1(t) + Y_0(t)$, where $Y_1$
is difference stationary and $Y_0$ is covariance stationary. Assuming that
$\Delta Y_1$ and $Y_0$ have finite time-invariant second moments, we can write
their Wold decompositions as

$$\Delta Y_1(t) = \sum_{k=0}^{\infty} a_1(k)\,\varepsilon_1(t-k) = A_1(L)\,\varepsilon_1(t)$$

and

$$Y_0(t) = \sum_{k=0}^{\infty} a_0(k)\,\varepsilon_0(t-k) = A_0(L)\,\varepsilon_0(t),$$

where the innovations $\varepsilon_1$ and $\varepsilon_0$ are uncorrelated at all leads and
lags. 13 For brevity, I shall refer to $Y_1$ and $Y_0$ as the permanent and
transitory components in labor income, respectively. The permanent
component in labor income should not be confused with permanent
income, which is precisely defined from the equations of the PIH.

By exactly the same reasoning as in the cases previously considered,
equilibrium consumption follows

$$\Delta C(t) = A_1(\beta)\,\varepsilon_1(t) + (1 - \beta) A_0(\beta)\,\varepsilon_0(t). \tag{7}$$

Equation (7) shows that the consumption response depends, in general,
on the kind of news that dominates in any given period. For $\beta$
close to unity, news that has only transitory effects will have only a
relatively small impact on consumption; the opposite is true of news
that turns out to have permanent effects. Thus the theory predicts
that consumption volatility depends on the relative importance of
permanent and transitory components in labor income.

12 Quah (1989) shows how to construct such a decomposition in which one component
is integrated, the other is stationary, and the innovations in the two components
are uncorrelated at all leads and lags. That paper also proves that such an orthogonal
decomposition can always be found. This is unlike the orthogonal decomposition in
Watson (1986), which, under some circumstances, may not exist. It is, however, a
maintained assumption that in the economy there are different structural disturbances,
not perfectly correlated, that have permanent and transitory effects on labor income.
Finally, note that in the current context, the Beveridge-Nelson (1981) decomposition is
not an interesting one to consider: When the two components are perfectly correlated,
forecasts of future labor income are invariant to whether one uses the Beveridge-Nelson
decomposition or the univariate Wold representation.

13 From n. 12, this orthogonality assumption is without loss since the representation
is to be used only in forecasting future labor income. Since the structural disturbances
to labor income may be correlated, $Y_1$ and $Y_0$ may not be directly interpretable. A
moment's reflection shows that this does not affect forecasts of future $Y$ and therefore
does not change our predictions for consumption behavior.

To see explicitly the implications for consumption volatility, we
need the Wold lag distributions $A_1$ and $A_0$, as well as the innovation
variances $\mathrm{var}(\varepsilon_1)$ and $\mathrm{var}(\varepsilon_0)$. While the components $Y_1$ and $Y_0$ are
guaranteed to sum to the observed labor income process $Y$, their
innovations $\varepsilon_1$ and $\varepsilon_0$ bear no simple relation to the innovation $\varepsilon$ in $Y$.
Nor is there a simple relation between the lag distributions $A_1$, $A_0$,
and $A$: in particular, it is not true that $A_1(z) + (1 - z)A_0(z)$ equals
$A(z)$.

How then is such a decomposition into permanent and transitory
components consistent with the time-series observations on aggregate
labor income? It is clearly necessary that the spectral densities of $\Delta Y_1$
and $\Delta Y_0$ sum pointwise to equal the spectral density of $\Delta Y$. Thus for
all $\omega$ in $(-\pi, \pi]$, we have

$$\mathrm{var}(\varepsilon)\,|A(e^{-i\omega})|^2 = \mathrm{var}(\varepsilon_1)\,|A_1(e^{-i\omega})|^2 + \mathrm{var}(\varepsilon_0)\,|1 - e^{-i\omega}|^2\,|A_0(e^{-i\omega})|^2.$$

Under weak regularity conditions, this relation across the spectral
densities is not only necessary but also sufficient to characterize the
orthogonal decomposition (see Quah 1989). In the subsequent discussion,
we can therefore focus only on this pointwise equality in the
spectral densities. Making the natural definitions, we can write this as

$$S(\omega) = S_1(\omega) + |1 - e^{-i\omega}|^2\,S_0(\omega). \tag{8}$$

This equation has two important features that we shall use repeatedly
below. First, since the second term on the right-hand side is nonnegative,
$S_1$ must be everywhere bounded from above by $S$. Next, $S_1$ must
equal $S$ at $\omega = 0$ since the second term on the right-hand side vanishes
there. Thus $S_1(\omega) \le S(\omega)$ for all $\omega$, with strict equality at $\omega = 0$. In
words, the spectral density of changes in observed labor income forms
an outer envelope for that in its permanent component, where that
outer envelope is binding at frequency zero. Put another way, the
spectral densities of changes in the permanent and transitory components
are a cleaving of that in observed labor income. 14 Figure 1 illustrates
such a cleaving of a spectral density.

By the equality at frequency zero of $S$ and $S_1$, agents’ forecasts of
the long-run effects of a disturbance are always the same, regardless
of whether agents view the disturbance as one in the permanent component
or as one in observed labor income itself. 15 The relative importance
of the permanent and transitory components is altered as we
vary the cleaving of the spectral density. Across all cleavings, however,
the measure of long-run persistence always remains the same.
Equation (7) suggests then that as long as $\beta$ is strictly less than one, the
volatility of consumption, which appears to vary with the relative
importance of the permanent and transitory components, is not determined
in any essential way by the magnitude of long-run persistence.
In the next section, we shall use the spectral density characterization
to show this rigorously.

14 Larry Christiano suggested this terminology.

15 Further, this long-run invariance can be shown to hold even when the permanent
and transitory components are correlated. See Cochrane (1988) for the case in which
the permanent component is restricted to be a random walk, and Quah (1989) for the
general case.


IV. Explaining “Excess Smoothness”

A first-order autoregressive model for the first differences of U.S.
aggregate labor income yields point estimates of 0.44 for the autoregressive
coefficient and 636.1 for the innovation variance. If the risk-free
interest rate $r$ is taken to be 1 percent per quarter, then equation
(6) implies that the variance of consumption changes should be 1,997;
the actual observed sample variance is only 246. 16 Tables 1 and 2 in
West (1988) display different autoregressive, integrated, moving average
(ARIMA) parameterizations for labor income that all show the
same conclusion. Consumption appears to be “excessively smooth,”
given the maintained assumption that labor income is an integrated
process.

16 These numbers are for the Blinder-Deaton (1985) data, which are those typically
used in studies on consumption volatility. It is evident that alternative “reasonable”
values of $r$ do not fundamentally narrow this difference between predicted and actual
variances. Properly accounting for the sampling properties of these estimates also does
not alter the conclusion that consumption appears too smooth compared with the
predictions of the model (see West 1988).
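
The predicted variance quoted above can be reproduced from equation (6). The following is a minimal sketch, assuming the discount factor is $\beta = 1/(1+r)$ and that the first-order autoregression implies $A(z) = (1 - 0.44z)^{-1}$; the function name is hypothetical.

```python
def predicted_var_dc(phi, var_eps, r):
    """Variance of consumption changes implied by equation (6), A(beta)^2 * var(eps),
    when the first difference of labor income is an AR(1) with coefficient phi."""
    beta = 1.0 / (1.0 + r)           # assumed mapping from interest rate to beta
    a_beta = 1.0 / (1.0 - phi * beta)
    return a_beta ** 2 * var_eps


predicted = predicted_var_dc(phi=0.44, var_eps=636.1, r=0.01)
sample_variance = 246.0              # sample variance reported in the text
print(round(predicted), predicted / sample_variance)
```

Under these assumptions the predicted variance is roughly eight times the sample variance, which is the gap that the excess smoothness literature refers to.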

Now suppose that agents use information on permanent and transitory
movements in labor income to estimate their human wealth.
Recall that consumption should then behave as

$$\Delta C(t) = A_1(\beta)\,\varepsilon_1(t) + (1 - \beta) A_0(\beta)\,\varepsilon_0(t),$$

which implies that the variance of consumption changes is

$$\mathrm{var}(\Delta C) = A_1(\beta)^2 \cdot \mathrm{var}(\varepsilon_1) + (1 - \beta)^2 A_0(\beta)^2 \cdot \mathrm{var}(\varepsilon_0).$$

Clearly, the univariate Wold characterization of labor income does
not directly restrict the smoothness properties of consumption.

A simple example will build intuition for the calculations to follow,
although the example is not completely successful in explaining excess
smoothness. Suppose that the permanent component $Y_1$ in labor
income is described by $A_1(z) = (1 - \gamma z)^{-1}$, with $|\gamma| < 1$; that is, the
permanent component is a stationary first-order autoregressive process
in first differences. Assume that the correct model for observed
labor income is the first-order autoregression in first differences
above. For the “outer envelope” condition described above to hold,
we must have (1) $\gamma > 0.44$ and (2) $\mathrm{var}(\varepsilon_1) = (1 - \gamma)^2 \times (1 - 0.44)^{-2}
\times 636.1$. Condition 1 guarantees that the spectral density $S$ of $\Delta Y$
dominates that of $\Delta Y_1$; condition 2 restricts these spectral densities to
be equal at frequency zero. It follows then that $\Delta Y - \Delta Y_1$ is the first
difference of a process that is covariance stationary. 17

The pointwise equality of the spectral density sum (8) then allows
derivation of the dynamics in $Y_0$. For all $z$, we have

$$\begin{aligned}
\mathrm{var}(\varepsilon_0) \cdot (1 - z)(1 - z^{-1}) A_0(z) A_0(z^{-1})
&= \mathrm{var}(\varepsilon) \cdot A(z) A(z^{-1}) - \mathrm{var}(\varepsilon_1) \cdot A_1(z) A_1(z^{-1}) \\
&= \mathrm{var}(\varepsilon)\left[\frac{1}{(1 - 0.44z)(1 - 0.44z^{-1})} - \left(\frac{1 - \gamma}{1 - 0.44}\right)^2 \frac{1}{(1 - \gamma z)(1 - \gamma z^{-1})}\right] \\
&= \mathrm{var}(\varepsilon)\,\frac{(1 - \gamma z)(1 - \gamma z^{-1}) - \left(\frac{1 - \gamma}{1 - 0.44}\right)^2 (1 - 0.44z)(1 - 0.44z^{-1})}{(1 - 0.44z)(1 - \gamma z)(1 - 0.44z^{-1})(1 - \gamma z^{-1})}.
\end{aligned}$$

17 Technically, the vanishing of a spectral density at frequency zero does not imply
that the associated stochastic process is the first difference of another that is covariance
stationary. Quah (1989) provides regularity conditions for this implication to hold. It is
easy to verify that these conditions are satisfied here.

Since the numerator of the right-hand-side expression vanishes at
$z = 1$, we can divide both sides by $(1 - z)(1 - z^{-1})$ to obtain

$$\begin{aligned}
\mathrm{var}(\varepsilon_0) \cdot A_0(z) A_0(z^{-1})
&= \mathrm{var}(\varepsilon)\,\frac{(1 - \gamma z)(1 - \gamma z^{-1}) - \left(\frac{1 - \gamma}{1 - 0.44}\right)^2 (1 - 0.44z)(1 - 0.44z^{-1})}{(1 - z)(1 - z^{-1})(1 - 0.44z)(1 - \gamma z)(1 - 0.44z^{-1})(1 - \gamma z^{-1})} \\
&= \mathrm{var}(\varepsilon)\,\frac{\gamma - \left(\frac{1 - \gamma}{1 - 0.44}\right)^2 \times 0.44}{(1 - 0.44z)(1 - \gamma z)(1 - 0.44z^{-1})(1 - \gamma z^{-1})}.
\end{aligned}$$

But this last expression is simply the covariance function of a second-order
autoregressive process,

$$Y_0(t) = (0.44 + \gamma)\,Y_0(t - 1) - (0.44 \cdot \gamma)\,Y_0(t - 2) + \varepsilon_0(t),$$

where

$$\mathrm{var}(\varepsilon_0) = \mathrm{var}(\varepsilon) \cdot \left[\gamma - \left(\frac{1 - \gamma}{1 - 0.44}\right)^2 \times 0.44\right]$$

is positive since $0.44 < \gamma < 1$.
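
As a check on the algebra, the cleaving identity (8) can be verified numerically for this example. This is a sketch rather than part of the original derivation; the function name and the frequency grid are hypothetical, and the constants 0.44 and 636.1 are the point estimates quoted in the text.

```python
import cmath
import math

PHI = 0.44        # AR(1) coefficient of the first difference of labor income
VAR_EPS = 636.1   # innovation variance of the univariate model


def cleaving_parts(gamma, omega):
    """Return (S, S1, |1 - e^{-i w}|^2 * S0) at frequency omega for the AR(1) example."""
    z = cmath.exp(-1j * omega)
    scale = ((1.0 - gamma) / (1.0 - PHI)) ** 2
    var_e1 = scale * VAR_EPS                       # fixes S1(0) = S(0)
    var_e0 = (gamma - scale * PHI) * VAR_EPS       # innovation variance of Y0
    s = VAR_EPS / abs(1.0 - PHI * z) ** 2
    s1 = var_e1 / abs(1.0 - gamma * z) ** 2
    # Y0 is AR(2) with autoregressive polynomial (1 - 0.44 z)(1 - gamma z).
    s0 = var_e0 / abs((1.0 - PHI * z) * (1.0 - gamma * z)) ** 2
    return s, s1, abs(1.0 - z) ** 2 * s0


grid = [math.pi * k / 200.0 for k in range(201)]
worst_gap = 0.0
for gamma in (0.5, 0.75, 0.9, 0.99):
    for omega in grid:
        s, s1, transitory = cleaving_parts(gamma, omega)
        worst_gap = max(worst_gap, abs(s - s1 - transitory) / s)
print(worst_gap)
```

The relative gap should be at the level of floating-point rounding, and $S_1$ should never exceed $S$ on the grid, in line with the outer envelope condition.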

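Recall that the contribution to $\mathrm{var}(\Delta C)$ is $A_1(\beta)^2 \cdot \mathrm{var}(\varepsilon_1)$ for disturbances
$\varepsilon_1$ that have a permanent impact on labor income, and
$(1 - \beta)^2 A_0(\beta)^2 \cdot \mathrm{var}(\varepsilon_0)$ for disturbances $\varepsilon_0$ with only transitory effects.
For the example here, these are

$$(1 - \gamma\beta)^{-2} (1 - \gamma)^2 (1 - 0.44)^{-2} \times 636.1$$

and

$$(1 - \beta)^2 (1 - 0.44\beta)^{-2} (1 - \gamma\beta)^{-2} \left[\gamma - \left(\frac{1 - \gamma}{1 - 0.44}\right)^2 \times 0.44\right] \times 636.1,$$

respectively. The predicted variance of $\Delta C$ is simply the sum of
these. For $\gamma = 0.5$, the implied value of $\mathrm{var}(\Delta C)$ is 1,989; for $\gamma =
0.75$, $\mathrm{var}(\Delta C) = 1,916$; for $\gamma = 0.8$, $\mathrm{var}(\Delta C) = 1,882$; for $\gamma = 0.9$,
$\mathrm{var}(\Delta C) = 1,727$; for $\gamma = 0.95$, $\mathrm{var}(\Delta C) = 1,493$; for $\gamma = 0.99$,
$\mathrm{var}(\Delta C) = 1,017$; and for $\gamma = 0.995$, $\mathrm{var}(\Delta C) = 1,083$.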
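
The two contributions can be evaluated directly. The sketch below assumes $\beta = 1/(1+r)$ with $r = 0.01$ and uses the closed forms given above; the function name is hypothetical. The totals it prints come out close to, though not necessarily identical with, the figures reported in the text, since the exact inputs and rounding used there are not stated.

```python
def var_dc_components(gamma, phi=0.44, var_eps=636.1, r=0.01):
    """Permanent and transitory contributions to var(dC) in the AR(1) example."""
    beta = 1.0 / (1.0 + r)                        # assumed discount factor
    scale = ((1.0 - gamma) / (1.0 - phi)) ** 2
    # A1(beta)^2 * var(eps1)
    permanent = scale * var_eps / (1.0 - gamma * beta) ** 2
    # (1 - beta)^2 * A0(beta)^2 * var(eps0)
    transitory = ((1.0 - beta) ** 2 * (gamma - scale * phi) * var_eps
                  / ((1.0 - phi * beta) ** 2 * (1.0 - gamma * beta) ** 2))
    return permanent, transitory


totals = {}
for gamma in (0.5, 0.75, 0.8, 0.9, 0.95, 0.99):
    permanent, transitory = var_dc_components(gamma)
    totals[gamma] = permanent + transitory
    print(gamma, round(permanent, 1), round(transitory, 1), round(totals[gamma], 1))
```

For values of $\gamma$ near 0.44 almost all of the predicted variance comes from the permanent component; as $\gamma$ approaches one the permanent contribution shrinks and the transitory contribution grows.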

Allowing agents to distinguish permanent and transitory components
in labor income, therefore, can potentially smooth the consumption
implied by the PIH. Even when $Y_1$ is restricted to be a first-order
autoregression in first differences, consumption volatility can
fall by as much as one-half over that when agents forecast labor income
using only its past history. The intuition of the previous section
is therefore correct: altering the cleaving of a fixed outer envelope
spectral density $S$ affects the PIH prediction for the volatility of consumption.
This consumption smoothing occurs without one’s having
to change any of the univariate properties of the labor income process.

To complete the argument, we need to show that a cleaving exists
that reconciles the actual volatility in consumption with the predicted
volatility, taking as given the univariate dynamics in labor income.
This existence question can be formulated in terms of an infinite-dimensional
optimization problem. Take as given (1) a Wold decomposition
$(A, \varepsilon)$ for $\Delta Y$ and (2) a real interest rate $r$ implying a value for
$\beta$. What is the minimum value of $A_1(\beta)^2\,\mathrm{var}(\varepsilon_1) + (1 - \beta)^2 A_0(\beta)^2 \times
\mathrm{var}(\varepsilon_0)$ such that (8) is satisfied? Formally, we have to solve

$$\inf_{A_1,\,A_0,\,\mathrm{var}(\varepsilon_1),\,\mathrm{var}(\varepsilon_0)} \; A_1(\beta)^2\,\mathrm{var}(\varepsilon_1) + (1 - \beta)^2 A_0(\beta)^2\,\mathrm{var}(\varepsilon_0)$$

subject to the conditions that (a)

$$\mathrm{var}(\varepsilon)\,|A(e^{-i\omega})|^2 = \mathrm{var}(\varepsilon_1)\,|A_1(e^{-i\omega})|^2 + \mathrm{var}(\varepsilon_0)\,|1 - e^{-i\omega}|^2\,|A_0(e^{-i\omega})|^2 \quad \text{for all } \omega,$$

and (b) $(A_1, \varepsilon_1)$ and $(A_0, \varepsilon_0)$ are Wold representations. The natural
parameter space is infinite dimensional and equals $\ell^2 \times \ell^2 \times \mathbb{R}^2$.
Given conditions 1 and 2, consumption displays excess smoothness if
this program has value exceeding the sample estimates of $\mathrm{var}(\Delta C)$.

Instead of solving this problem directly, it is sufficient to display an
example satisfying conditions a and b that achieves a value equal to
the sample estimate of the variance of consumption changes. As before,
a choice for $A_1$ immediately determines all the remaining parameters.
Equality of $S_1$ and $S$ at frequency zero fixes the innovation
variance $\mathrm{var}(\varepsilon_1)$ in the permanent component. The pointwise equality
for all $\omega$, given as $|1 - e^{-i\omega}|^2 S_0(\omega) = S(\omega) - S_1(\omega)$, determines the
function $S_0$. This spectral density $S_0$ can then be factored to obtain
uniquely $\mathrm{var}(\varepsilon_0)$ and $A_0$ in $\ell^2$, where $A_0(0) = 1$ and $A_0(z) \neq 0$ for all
$|z| < 1$. This calculation is standard since the resulting spectral density
is, by construction, a rational function (see, e.g., Rozanov 1967, chap.
1, sec. 10, pp. 43-50).

More explicitly, fix a candidate Wold representation $(A, \varepsilon)$ for $\Delta Y$.
Consider lag distributions $A_1$ of the form

$$A_1(z) = \frac{(1 + z)^q}{A_{1d}(z)},$$

where $A_{1d}$ is some fixed polynomial, such that $A_{1d}(0) = 1$ and $A_{1d}(z) \neq
0$ for $|z| \le 1$. 18 This restricts the permanent component $Y_1$ to be an
ARIMA process, with the moving average part having binomial coefficients.
As $q$ increases, the spectral density $\mathrm{var}(\varepsilon_1)\,|A_1(e^{-i\omega})|^2$ is
guaranteed eventually to be bounded from above by any fixed spectral
density that shares the same value at frequency zero. 19 By $S_1(0) =
S(0)$, it follows that

$$\mathrm{var}(\varepsilon_1) = 4^{-q} A_{1d}(1)^2 A(1)^2 \times \mathrm{var}(\varepsilon).$$


In the second step, the dynamics of $\Delta Y_0$ satisfy

$$\mathrm{var}(\varepsilon_0) \cdot (1 - z)(1 - z^{-1}) A_0(z) A_0(z^{-1}) = \mathrm{var}(\varepsilon)\left[A(z) A(z^{-1}) - 4^{-q} A_{1d}(1)^2 A(1)^2\,\frac{(1 + z)^q (1 + z^{-1})^q}{A_{1d}(z) A_{1d}(z^{-1})}\right].$$

The right-hand side above vanishes at $z = 1$, by our choice of $\mathrm{var}(\varepsilon_1)$.
We can therefore write

$$\mathrm{var}(\varepsilon_0)\,A_0(z) A_0(z^{-1}) = \mathrm{var}(\varepsilon)\,\frac{A(z) A(z^{-1}) - 4^{-q} A_{1d}(1)^2 A(1)^2\,\dfrac{(1 + z)^q (1 + z^{-1})^q}{A_{1d}(z) A_{1d}(z^{-1})}}{(1 - z)(1 - z^{-1})}.$$

For sufficiently large $q$, the right-hand side is the covariogram of a
real covariance stationary process. We can therefore factor it to obtain
$\mathrm{var}(\varepsilon_0)$ and $A_0$ such that (1) $A_0(0) = 1$, (2) the power series expansion
of $A_0(z)$ is one-sided in nonnegative powers of $z$, and (3) $A_0(z) \neq 0$ for
all $|z| < 1$. 20 Finally, when a value for $\beta$ is taken, the resulting lag
distributions and innovation variances can be used to find the implied
variance of consumption changes.
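
The first step of this procedure is easy to sketch numerically. The code below is an illustration under stated assumptions: it takes the first-order autoregression with coefficient 0.44 and innovation variance 636.1 as the candidate Wold representation (the tables cover nine ARMA candidates, which are not reproduced here), uses $A_{1d}(z) = (1 - 0.8z)(1 - 0.85z)$ from footnote 18, and sets $\beta = 1/(1+r)$ with $r = 0.01$. It checks $S_1(0) = S(0)$, checks the outer envelope condition on a frequency grid, and computes only the permanent contribution $A_1(\beta)^2\,\mathrm{var}(\varepsilon_1)$; the transitory contribution requires the spectral factorization step and is not attempted. All function names are hypothetical.

```python
import cmath
import math

PHI, VAR_EPS = 0.44, 636.1     # assumed candidate Wold representation for dY


def a(z):
    """A(z) for the AR(1) candidate."""
    return 1.0 / (1.0 - PHI * z)


def a1d(z):
    """A_1d(z) = (1 - 0.8 z)(1 - 0.85 z), as in footnote 18."""
    return (1.0 - 0.8 * z) * (1.0 - 0.85 * z)


def var_e1(q):
    """Innovation variance of the permanent component implied by S1(0) = S(0)."""
    return 4.0 ** (-q) * a1d(1.0) ** 2 * a(1.0) ** 2 * VAR_EPS


def s(omega):
    return VAR_EPS * abs(a(cmath.exp(-1j * omega))) ** 2


def s1(omega, q):
    z = cmath.exp(-1j * omega)
    return var_e1(q) * abs(1.0 + z) ** (2 * q) / abs(a1d(z)) ** 2


def permanent_contribution(q, r=0.01):
    """A1(beta)^2 * var(eps1) for A1(z) = (1 + z)^q / A_1d(z)."""
    beta = 1.0 / (1.0 + r)
    a1_beta = (1.0 + beta) ** q / a1d(beta)
    return a1_beta ** 2 * var_e1(q)


grid = [math.pi * k / 400.0 for k in range(401)]
for q in (0, 10, 50, 200):
    envelope_ok = all(s1(w, q) <= s(w) * (1.0 + 1e-9) for w in grid)
    print(q, envelope_ok, round(permanent_contribution(q), 1))
```

In this sketch the permanent contribution falls geometrically in $q$, by the factor $[(1 + \beta)/2]^2$ per unit increase, so matching a small sample variance calls for a long moving average.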

Let us fix $\beta$ to the value implied by a risk-free interest rate $r$ of 1
percent per quarter. 21 Tables 1-9 display the results of the procedure
above for each of the candidate Wold representations for $\Delta Y$ in West
(1988). West estimated a variety of models to check the robustness of
his findings. Although the results here lead to the opposite conclusion
from his, here, as in his work, the findings do turn out to be insensitive
to the exact parametric specification. The first section in each
table gives point estimates for alternative autoregressive moving average
(ARMA) parameterizations of the Wold representation of $\Delta Y$. In
our notation, the ARMA parameters $\phi$ and $\theta$ satisfy $A(z) = (1 - \phi_1 z
- \phi_2 z^2)^{-1}(1 + \theta_1 z + \theta_2 z^2)$. The innovation variance associated with
the given ARMA parameterization is presented next. These always
exceed 600; the largest value is naturally that in table 1, the random
walk case. Next the implied measure of long-run persistence is presented,
as, for example, in Campbell and Mankiw (1987). Following
that, $\psi_0$ is the square root of the ratio of PIH-predicted $\mathrm{var}(\Delta C)$ to
$\mathrm{var}(\varepsilon)$. Finally, $\psi$ is the actual square root of the ratio of $\mathrm{var}(\Delta C)$ to
$\mathrm{var}(\varepsilon)$ found in the data. The discrepancy between $\psi_0$ and $\psi$ is one
representation of the Deaton paradox.

18 We take the polynomial $A_{1d}(z)$ here to be $(1 - 0.8z)(1 - 0.85z)$. This fixes the
dominant root in the autoregressive part of $A_1$ at 0.85. If this is not done, it might seem
that the procedure simply trades off a declining importance in the permanent component
for the transitory component approaching nonstationarity. The exact choice,
however, is arbitrary otherwise.

19 This will then satisfy condition a. The moving average form here is also known to
minimize the innovation variance of a process that has spectral density fixed at frequency
zero (see Quah 1989). However, this second fact is not directly useful here.

20 The lag distribution $A_0$ is, in fact, simply the series expansion of a rational function.
The denominator and numerator parts can therefore be obtained separately by a
standard algorithm, such as that in Wilson (1969).

21 This is the value that Christiano (1987) uses. West (1988), on the other hand, uses
r = 0.5 percent. The results do not much depend on exactly which value is assumed,
as long as r is strictly positive.

The second section in each table shows the implied variance of
consumption changes due to the hypothesized permanent and transitory
components in labor income. For alternative settings of $q$ (the
moving average length in $\Delta Y_1$) I show first the individual variance
contributions of the different kinds of disturbances and then the sum
of these contributions. Notice that the contribution of $\varepsilon_1$ is always
much larger than that of $\varepsilon_0$. This is consistent with the message in
Lucas (1987, chap. 3) that cyclical fluctuations, by comparison with
secular movements, are simply not significant for many economic
questions.

The last row in this section shows $\psi_1$, the square root of the ratio of
the implied $\mathrm{var}(\Delta C)$ to $\mathrm{var}(\varepsilon)$. As $q$ increases, the value of $\psi_1$ falls
monotonically. In the last column of this section, I show the value of $q$
that implies $\psi_1 = \psi$. Finally, the last section of each table presents the
value for $\mathrm{var}(\varepsilon_0)$ associated with that $q$ that matches $\psi_1$ to $\psi$.

The last column in the second section of each table therefore answers
positively the existence question posed above. For all nine
ARMA models hypothesized for the Wold representation of $\Delta Y$,
there exists a permanent-transitory decomposition that exactly
matches the PIH predicted consumption volatility with the data. For
all nine models considered, the long-run measure of persistence is
substantial. Despite this, for all nine models, the PIH, properly considered,
does not predict consumption volatility exceeding that in the
data. Along this dimension, therefore, the PIH is not inconsistent
with the data. In summary, the PIH predictions do not particularly
depend on (1) whether labor income is best characterized as being
integrated, (2) what the magnitude of labor income’s long-run persistence
is, or (3) what the exact form of labor income’s univariate dynamics
is. By the nature of the argument, it should be clear that these
conclusions would hold regardless of the exact parameterization of
the Wold representation for $\Delta Y$, even beyond the ARMA(2, 2) cases
explicitly considered here.

22 Although it is not presented here, I have verified that the zeros of the autoregressive
and moving average parts in the transitory component are strictly outside the unit
circle. This property is guaranteed by the algorithm used to factor the spectral density
(Wilson 1969). The condition on the moving average zeros guarantees that the representations
for both $Y_1$ and $Y_0$ are fundamental, which is necessary for applying the
Hansen-Sargent formula.

Are the permanent-transitory decompositions that reconcile the
PIH with the data reasonable? It is difficult to interpret directly the
permanent and transitory components used here for forecasting since
they are not necessarily structural economic disturbances. Thus one
should not read the $q$ values in the last column of the tables as saying
that economic agents perceive structural shocks with permanent effects
as very long ARIMA processes. The true structural shocks
agents see are likely to be imperfectly correlated across disturbances
with permanent and transitory effects.

It might be interesting to explicitly identify the structural disturbances
that agents see driving labor income. However, such an exercise
is not at all relevant in the current context. Instead, here we might
simply compare our transitory component in labor income with stationary
components that others have estimated. Figure 2 plots the
response in labor income to a unit disturbance in the transitory component.
The response for each of the nine models considered in the
tables is graphed. In every case, the effects have a hump shape and
decay rapidly: no more than half the original impact of the disturbance
remains after 4 years. This is quite consistent with the moving
average representation that, for instance, Blanchard and Quah (1989)
call the dynamic effects of “aggregate demand” in their study of gross
national product.

Fig. 2. Labor income response to an innovation in the stationary component

V. Effects of Agents’ Superior Information 

West (1988) and Campbell and Deaton (1989) have also emphasized 
that economic agents are likely to have more information than the 
econometrician. Equilibrium consumption is then necessarily 
smoother than when agents use only the past history of labor income 
to forecast future labor income. The question becomes, Could this 
superior information effect suffice to account for the observed 
smoothness in the data? West and Campbell and Deaton find the 
answer to be no. 

Suppose that a researcher attempts to account for this superior
information by studying the history of time-series observations on
consumption and income. 23 Recall that consumption is a martingale
under the rational expectations version of the PIH. Thus it is natural
to suspect that the history of consumption should contain all the
relevant information that agents use to forecast the future. In other
words, even though economic agents are likely to have more information
than the researcher, studying the joint consumption-income process
should allow discovery of the correct relation between news and
the reaction in consumption, even though the researcher never directly
observes news. This argument appears to be related to a result
in Hansen and Sargent (1981). Their theorem shows that, under
certain conditions, the hallmark rational expectations cross-equation
restrictions hold, even when the researcher uses an information set
strictly smaller than that of economic agents.

The explanation given in this paper of course says that agents have 
more information than the econometrician. Why, then, doesn’t consumption
appropriately reveal the news that agents see? The reason
for this is interesting in its own right: the PIH turns out to imply that 
agents observe innovations that are not fundamental for the joint 
consumption-income process. I now show this explicitly. 24 

23 West (1988) and Campbell and Deaton (1989) use more information than this;
therefore the statements below do not apply to their work.

24 This nonfundamentalness is a property of the Hilbert spaces spanned by the
history of the $\varepsilon$'s and that of the observed sequences $\Delta Y$ and $\Delta C$. A little reflection,
therefore, shows that it is invariant to whether or not the structural shocks to the
economy are correlated or, equivalently, whether or not the $\varepsilon$’s are the structural
innovations.
Under the PIH and the assumptions on permanent and transitory
fluctuations, the joint process for the changes in labor income and
consumption is

$$\begin{pmatrix} \Delta Y(t) \\ \Delta C(t) \end{pmatrix} = \begin{pmatrix} A_1(L) & (1 - L) A_0(L) \\ A_1(\beta) & (1 - \beta) A_0(\beta) \end{pmatrix} \begin{pmatrix} \varepsilon_1(t) \\ \varepsilon_0(t) \end{pmatrix}. \tag{9}$$

The determinant of this matrix moving average is the function

$$\delta(z) = (1 - \beta) A_0(\beta) A_1(z) - A_1(\beta)(1 - z) A_0(z).$$

Two features of this function should be noted here.

1. The determinant $\delta$ is different from zero at $z = 1$. The spectral
density of the jointly covariance stationary vector $(\Delta Y, \Delta C)$ is therefore
of full rank at frequency zero. In words, labor income and consumption
are not cointegrated. (Campbell [1987] has made the same observation.)

The intuition is straightforward: the martingale consumption implication
of the PIH means that any news will have a permanent
impact on the level of consumption. In particular, even news that has
only transitory effects on labor income has permanent effects on consumption.

2. The determinant $\delta$ vanishes at $z = \beta$, which is strictly inside the
unit circle. But then $\varepsilon_1$ and $\varepsilon_0$, as well as all linear combinations of
them, cannot be recovered from observations on current and lagged
values of the exogenous and endogenous variables $\Delta Y$ and $\Delta C$ (see,
e.g., Rozanov 1967, p. 63, remark 3).
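
Both features of the determinant can be confirmed for any particular pair of lag distributions. The sketch below uses hypothetical choices, $A_1(z) = (1 - 0.5z)^{-1}$ and $A_0(z) = [(1 - 0.44z)(1 - 0.5z)]^{-1}$, together with the assumed value $\beta = 1/1.01$; only the formula for $\delta(z)$ is taken from the text.

```python
def determinant(z, beta, a1, a0):
    """delta(z) = (1 - beta) A0(beta) A1(z) - A1(beta) (1 - z) A0(z)."""
    return (1.0 - beta) * a0(beta) * a1(z) - a1(beta) * (1.0 - z) * a0(z)


def a1(z):
    # Hypothetical permanent-component lag distribution.
    return 1.0 / (1.0 - 0.5 * z)


def a0(z):
    # Hypothetical transitory-component lag distribution.
    return 1.0 / ((1.0 - 0.44 * z) * (1.0 - 0.5 * z))


beta = 1.0 / 1.01
delta_at_one = determinant(1.0, beta, a1, a0)     # nonzero: no cointegration
delta_at_beta = determinant(beta, beta, a1, a0)   # zero inside the unit circle
print(delta_at_one, delta_at_beta)
```

The zero at $z = \beta$ does not depend on the particular $A_1$ and $A_0$ chosen, since the two terms of $\delta(\beta)$ are identical by construction.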

Note that this does not mean that agents are somehow forecasting
using future values of labor income. Recall that agents use only current
and lagged values of $Y_1$ and $Y_0$ to predict future labor income.
The “nonfundamentalness” means simply that the observed variables
contain strictly less information than that in the $\varepsilon$'s. We can see explicitly
the implications for inference by considering a simple example.

Consider the Friedman-Muth model (Muth 1960): Suppose that
the permanent component is a random walk, $A_1(z) = 1$, and that the
transitory component is white noise, $A_0(z) = 1$. Further, assume that
the innovations $\varepsilon_1$ and $\varepsilon_0$ have unit variances. If we substitute into (9),
these assumptions imply that agents in the economy observe the
bivariate income-consumption model

$$\begin{pmatrix} \Delta Y(t) \\ \Delta C(t) \end{pmatrix} = \begin{pmatrix} 1 & 1 - L \\ 1 & 1 - \beta \end{pmatrix} \begin{pmatrix} \varepsilon_1(t) \\ \varepsilon_0(t) \end{pmatrix}. \tag{10}$$

When an econometrician studies the history of observations on labor 
income and consumption, the most information she can recover is 
their true Wold representation: the projection of (ΔY, ΔC) on its 
lagged values. 25 It is not hard to show, given (10), that the unique 
Wold representation for (ΔY, ΔC) with a pairwise orthogonal unit 
variance innovation vector is 


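$$\begin{pmatrix} \Delta Y(t) \\ \Delta C(t) \end{pmatrix} = \lambda(\beta)^{-1} \begin{pmatrix} (2 - \beta)\left\{1 - \left[\dfrac{1 - \beta}{2 - \beta}\right] z\right\} & (1 - \beta z) \\ \lambda(\beta)^{2} & 0 \end{pmatrix} \begin{pmatrix} \eta_1(t) \\ \eta_2(t) \end{pmatrix}, \tag{11}$$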


where λ(β)² = 1 + (1 − β)². The disturbances η₁ and η₂ are pairwise 
orthogonal white noise having unit variance. 26 While the determinant 
of the matrix moving average in the agents’ model (10) is z − β 
and vanishes at β < 1, that in the econometrician’s model (11) is 
proportional to 1 − βz, which vanishes nowhere on the unit disk. The 
econometrician’s representation is therefore fundamental. 

The cross-equation restrictions in (9) completely describe the 
predicted response of consumption to news in labor income. A 
disturbance to labor income, whose first difference has Wold lag 
distribution (1 − z)A_y(z), should lead to a consumption response of 
(1 − β)A_y(β). Consider the econometrician’s representation (11). In 
response to an η₁ disturbance, consumption responds by λ(β), which is 
exactly λ(β)⁻¹(2 − β){1 − [(1 − β)/(2 − β)]z} evaluated at z = β. Thus 
the econometrician will infer that consumption appears to respond 
appropriately to the disturbance η₁. Next, consider an η₂ disturbance. 
The econometrician reasons that consumption should respond to η₂ 
by λ(β)⁻¹(1 − β²) > 0; however, in the data, consumption does not at 
all react to η₂. In other words, consumption will appear to be 
“excessively smooth,” even though the joint income-consumption process 
satisfies the PIH. 
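
The Friedman-Muth example can be checked numerically. The following is a minimal sketch, not part of the original analysis: it codes the moving-average coefficient matrices of (10) and (11), confirms that they imply identical autocovariances, and compares the consumption response that the cross-equation restriction predicts with the response actually present in each representation. The function names and the value β = 0.95 are illustrative assumptions.

```python
import numpy as np


def agents_model(beta):
    """MA coefficients [M0, M1] of (10); rows (dY, dC), columns (eps_1, eps_0)."""
    m0 = np.array([[1.0, 1.0], [1.0, 1.0 - beta]])
    m1 = np.array([[0.0, -1.0], [0.0, 0.0]])
    return [m0, m1]


def econometrician_model(beta):
    """MA coefficients [M0, M1] of (11); rows (dY, dC), columns (eta_1, eta_2)."""
    lam = np.sqrt(1.0 + (1.0 - beta) ** 2)
    m0 = np.array([[(2.0 - beta) / lam, 1.0 / lam], [lam, 0.0]])
    m1 = np.array([[-(1.0 - beta) / lam, -beta / lam], [0.0, 0.0]])
    return [m0, m1]


def autocovariances(ma, max_lag=2):
    """Gamma(k) = E[x(t) x(t-k)'] for x(t) = sum_j M_j e(t-j), e unit-variance white noise."""
    gammas = []
    for k in range(max_lag + 1):
        g = np.zeros((2, 2))
        for j in range(k, len(ma)):
            g += ma[j] @ ma[j - k].T
        gammas.append(g)
    return gammas


def pih_implied_response(ma, beta, shock):
    """Income lag polynomial for `shock`, evaluated at z = beta."""
    return sum(m[0, shock] * beta ** j for j, m in enumerate(ma))


def observed_response(ma, shock):
    """Impact response of consumption to `shock` in the representation."""
    return ma[0][1, shock]


if __name__ == "__main__":
    beta = 0.95  # hypothetical discount factor, for illustration only
    wold = econometrician_model(beta)
    for shock, name in enumerate(["eta_1", "eta_2"]):
        print(name, pih_implied_response(wold, beta, shock), observed_response(wold, shock))
```

Under these assumptions the two representations share the same second moments, the predicted and actual responses agree for η₁, and the predicted response to η₂ is positive while the actual response is zero, which is the apparent excess smoothness described in the text.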

This example shows why an econometrician, studying the past 
history of observed model variables, might not draw the correct 
inference on the dynamic effects of different disturbances. 27 It is not the 
case that agents observe future events that the econometrician need 
only wait to observe similarly. Agents in the model condition their 
actions only on observations of the past history of permanent and 
transitory disturbances; no future information is involved. 

From (9), our theory clearly implies restrictions across the 
equations for income and consumption. In principle, the model (9) could 
be estimated and tested, for example, by maximizing the Whittle 
frequency domain likelihood. The results of that exercise are known in 
advance, though. Aggregate consumption is not a martingale, and so 

25 Without loss, we can restrict analysis to the second-moment properties of the data. 

26 We obtain this by Rozanov's (1967) discussion (chap. 1, sec. 10, pp. 43-50). It is 
straightforward to verify that the matrix covariograms implied by the right-hand sides 
of (10) and (11) are identical. 

27 This does not contradict Hansen and Sargent’s (1981) results. The PIH restrictions 
on consumption and income are not of the form in their theorem. 
the PIH would be rejected (see, e.g., Christiano et al. 1987; Nelson 
1987; Heaton 1988). But evidently this would happen for reasons 
other than “excess smoothness.” 

VI. Conclusion 

Deaton (1987) noted that (1) the permanent income hypothesis, (2) a 
unit root for the labor income process, and (3) the estimated 
univariate dynamics of U.S. aggregate labor income together imply volatile 
consumption. Such volatility is not seen in the time-series data on 
aggregate consumption. However, if labor income is modeled as 
trend stationary, then the permanent income hypothesis implies 
consumption volatility roughly in line with that in the data. Deaton’s 
finding makes a strong case that the univariate dynamic properties of 
labor income—labor income’s difference stationarity or trend 
stationarity, its long-run persistence, or its univariate short-run 
dynamics—are relevant for evaluating an important economic hypothesis. 
More generally, it argues for the importance of measures of 
persistence, as, for instance, articulately proposed in Campbell and Mankiw 
(1987). 

This paper has shown why the reasoning above is misleading. By 
making the plausible assumption that agents observe different kinds 
of disturbances to their labor income stream, one can bring back the 
volatility predictions of the permanent income hypothesis firmly in 
line with the data. This can be done regardless of the precise form of the 
univariate dynamics in labor income. Quite generally, therefore, this 
paper argues that the univariate characterizations of aggregate time 
series are simply not informative for economic theory. 

The idea that there are permanent and transitory disturbances in 
time series is an old one, going back at least to Milton Friedman. This 
assumption raises interesting testable hypotheses in many areas of 
empirical time-series research. That agents “see” things unobservable 
to the econometrician has already led to many useful insights, such as 
the notion of human capital in labor economics and growth. In the 
current paper, it has served to explain a puzzle, in which consumption 
smoothness seemed to be inconsistent with a unit roots representation 
for labor income dynamics. 

It is important to emphasize what the paper does not do. First, the 
unit roots hypothesis has been critical in reorienting econometric 
inference and modeling. It has provided rich insights for reinterpreting 
evidence on many interesting economic propositions. 28 This paper 
does not at all argue against this. The results in this paper do, 
however, lead one to be extremely skeptical of conclusions such as those in 

28 For examples, see the excellent paper by Stock and Watson (1988) and references 
therein. An opposing view is presented in Sims (1988). 
Deaton (1987) and Campbell and Mankiw (1987). The focus in those 
papers on “large” versus “small” unit roots—an unfortunate 
terminology introduced in Cochrane (1988)—and the idea that, somehow, 
this has something to do with interesting economic hypotheses appear 
unjustified. It would be interesting to display an explicit economic 
model in which this faith is, in fact, well placed. 

Second, the paper does not say that the permanent income 
hypothesis accurately describes the aggregate time series. The 
martingale predictions for consumption, originally developed in Hall (1978), 
are now well known to be false in the data. Allowing agents to observe 
permanent and transitory disturbances separately does not alter this 
conclusion. 

This paper examines the entry implications of physician advertising. 
Evidence suggests that advertising inhibits entry into this market. 
Nevertheless, experienced physicians (incumbents), to whom 
advertising would offer the greatest financial benefit, in fact advertise 
less—a paradox that may be explained by nonfinancial concerns, 
such as unwillingness to break well-internalized professional norms 
against advertising. Physician advertising has risen sharply in recent 
years, and it appears that this trend will continue. If incumbents 
increasingly resort to advertising, there could be a substantial 
redistribution of income from less-well-established physicians to 
better-established ones. 


How does advertising affect competition? This question has been of 
considerable interest to sellers seeking to gain an edge, policymakers 
attempting to set appropriate rules, and economists in search of their 
Holy Grail: an understanding of market function. We join the 
economists in their quest. General conclusions about the impact of 
advertising on competition do not emerge from the literature because the 
specifics of market structure, such as information conditions and the 
availability of capital to new competitors, are so important. In their 
survey of the literature, Comanor and Wilson (1979, p. 470) conclude 
that 

the weight of available evidence is consistent with the 
hypothesis that heavy advertising can have substantial 
anticompetitive consequences. However, because the distribution of 
advertising intensities is highly skewed, there is no indication 
that these effects are pervasive throughout the economy, or 
even within the manufacturing sector. Rather, they appear 
to be concentrated in a number of industries with high 
advertising-sales ratios and/or high absolute levels of 
advertising per firm. 

More recently, Kessides (1986) used data on 266 U.S. manufacturing 
industries to study the relationship between advertising and firm 
entry from 1972 to 1977. His results suggest that 
advertising promotes entry in the strong majority of industries but 
may retard it in a few. 

We examine advertising in the market for physician services. The 
problem is salient given the magnitude of the resources involved 
(physicians are responsible for most decisions in the health care 
sector, which by 1986 constituted 10.9 percent of gross national 
product); 1 the information conditions in the market (there is no 
standardized product and quality is hard to judge); and the considerable policy 
interest in whether inhibitions to physician advertising are 
anticompetitive. We focus on entry, broadly interpreted as the ability of new 
participants to secure a share of the market, as reflected in earnings. 
Most previous work on advertising and entry has addressed 
manufacturing, which provides limited guidance on effects in the physician 
services market. 

Leffler’s (1981) work on the effects of advertising of prescription 
drugs may be germane since reputation is critical to sales in both 
businesses. (Prescription drugs are far more heavily advertised than 
physician services, however.) Leffler finds that advertising promotes 
the entry of superior new drugs but probably retards the entry of low- 
priced close substitutes. 

The effects of advertising on entry have not been studied for the 
physician services market. Evidence presented by Folland (1987) 
indicates that less experienced physicians are more likely to advertise. 

1 The latest year for which actual figures (rather than estimates) are available is 1986. 
See Health Care Financing Administration (1987, p. 1). 
This might seem to suggest that advertising offers greater benefits to 
entrants (less experienced physicians) and thus will promote 
competition. However, the advertising decision may depend on other factors 
besides the desire to increase income. A strong ethic against 
advertising has been an established norm in medicine until recently. Older 
physicians may feel quite uncomfortable with advertising and may 
choose not to advertise even if doing so would improve their income. 
In a world in which attitudes toward advertising are changing, a 
greater propensity to advertise need not imply that greater financial 
benefits are realized. Our analysis will explicitly test the naive 
conclusion that young doctors secure more financial benefit from 
advertising than older ones. 

Advertising may affect competition in ways apart from influencing 
entry. For example, it may affect the market among more established 
physicians, giving a greater market share to those able to project a 
more positive image. Or, if market shares are relatively unchanged, it 
could increase price competition or merely impose advertising costs 
on producers. (Just such arguments have been used to suggest that 
bans on cigarette advertising may raise tobacco company profits.) On 
the other hand, as Chamberlin (1962, p. 72) noted many years ago, 
advertising can create the type of perceived product differentiation 
that promotes the monopolistic component of monopolistic competition. 

Our interest remains focused on how advertising affects entry in 
the market for physician services. To estimate this relationship, we 
use a two-stage switching regression model similar in format to the 
union-nonunion wage model of Lee (1978) and the educational 
choice model of Willis and Rosen (1979). More specifically, we 
estimate annual earnings for physician advertisers and nonadvertisers, 
adjusting for possible selection effects. 2 Then we compare the 
relationship between earnings and an array of physician characteristics, 
such as years of experience. This allows us to draw inferences about 
whether physician advertising promotes entry, by acting as a 
substitute for consumer familiarity with a service (presumably consumers 
are more familiar with physicians who have practiced longer), or 
whether it tends to inhibit such competition, by acting as a 
complement to consumer familiarity. 

The paper is divided into six sections. Section I discusses the nature 

2 An alternative possibility is to use hourly earnings. However, annual earnings 
provide a better measure of the benefits from physician advertising. For instance, to the 
extent that advertising increases the physician's caseload, it will also increase annual 
earnings but could well decrease hourly earnings. We thank an anonymous referee for 
pointing this out. While annual earnings are a preferable measure for our purposes, 
using hourly earnings leads to results very similar to those reported in the text. 
of physician advertising. Section II presents a brief history of 
advertising in the medical profession, highlighting the vigorous efforts of 
the Federal Trade Commission (FTC) to eliminate advertising 
restrictions in this industry. Section III formally states the hypotheses to be 
tested, while Section IV specifies the empirical model to be estimated. 
The estimation results, using data from a 1987 American Medical 
Association (AMA) survey, are presented and discussed in Section V. 
Section VI presents conclusions. 

I. The Nature of Physician Advertising 

The available evidence (there have been no academic studies) 
suggests that physician advertisements consist primarily of objective 
information, such as practice location and specialty, in contrast to 
quality claims, testimonials, and glitz. Survey results reported by Folland 
(1987) indicate that over 90 percent of physicians believe that it is very 
difficult to advertise competence and quality of services. Many 
physicians (69 percent in Folland’s survey) also believe that advertising will 
damage their prestige. These beliefs may explain the absence of 
quality claims and testimonials in physician advertising. 

Peer pressure to avoid particularly undignified forms of 
advertising, such as quality claims and testimonials, may be quite strong. In 
discussing trends in physician advertising, Gray (1986, p. 188) points 
out that while physicians may engage in “low-profile” forms of 
advertising, such as information brochures, telephone stickers, and patient 
newsletters, “the stigma of commercialism still taints some more overt 
forms of advertising. Peer pressure continues to deter some doctors 
from going ahead—and continues to sting those who do.” 

In addition, the FTC applies particularly strict standards of 
truthfulness to advertisements by physicians. 3 Barney (1985, p. 5) quotes 
an FTC statement on this issue: “What may be false and deceptive for 
doctors may be permissible for sellers of other products and services. 
Harmless puffery for a household product may be deceptive in a 
medical context.” Thus stricter regulatory standards may also limit 
the use of quality claims, testimonials, and other “puffery” in 
physician advertisements. 

Physician advertising of fees appears to be rare. Folland (1987) 
reports that over 70 percent of physicians feel that advertising of fees 
will adversely affect their public image, and 60 percent do not believe 
that fee advertising will offer them any personal benefit. 

3 The FTC's strict stance on deceptive advertising in the medical profession appears 
to have come in response to pressure exerted by the AMA to reduce the FTC’s 
authority in medical markets. We thank an anonymous referee for pointing this out. 
New developments in physician advertising can be expected over 
the next few years. In response to a perceived lack of consumer 
information about alternative physician choices, a physician advertising 
industry is beginning to emerge. Firms that handle physician 
advertisements contract directly with physicians, obtaining detailed 
information about each. These agencies then advertise their services to 
consumers. 4 

II. Restrictions on Physician Advertising: A Brief History 

Until the 1980s, advertising was frowned on by medical organizations. 
In its first code of ethics, published in 1847, the AMA referred to 
advertising as “the ordinary practices of empirics, highly 
reprehensible in a regular physician” (Leake 1975, p. 224). Perhaps in response 
to the threat of FTC intervention, this position was modified to 
permit limited advertising. In 1976 the AMA's Judicial Council on 
advertising and solicitation by physicians stated that neither its 
“long-standing policy” nor the AMA’s Principles of Medical Ethics prohibited 
advertising. Rather, solicitation was opposed: 

The Principles do not proscribe advertising, they proscribe 
the solicitation of patients. Advertising means the action of 
making information or intention known to the public. . . . 
The term “solicitation” in the Principles means the attempt to 
obtain patients by persuasion or influence, using statements 
or claims which (1) contain testimonials; (2) are intended or 
likely to create inflated or unjustified expectations of 
favorable results; (3) are self-laudatory and imply that the 
physician has skills superior to other physicians engaged in his 
field or specialty of practice; or (4) contain incorrect or 
incomplete facts, or representations or implications that are 
likely to cause the average person to misunderstand or be 
deceived. [1976, p. 2328] 

Discontent with these limitations, the FTC pursued litigation 
against organizations that tried to restrict advertising in the medical 


4 One such company is Consumer Health Services. Based in Boulder, Colo., it 
currently serves seven metropolitan areas: Chicago, Dallas/Fort Worth, Denver, Houston, 
Kansas City, Milwaukee, and Washington, D.C. In selecting a physician through these 
agencies, consumers are asked to state their needs in general terms and are then given 
detailed information about a variety of physicians who might be suitable. This 
information includes practice location, specialty, and board certification status, as well as other 
important details such as fees, treatment philosophy, and bedside manner. 
profession. 5 In 1975 the FTC issued a complaint against the AMA, 
alleging that the association illegally restrained trade among 
physicians in violation of section 5 of the FTC Act by preventing 
solicitation of business by advertising. In 1980 the New York federal 
appeals court ruled in favor of the FTC, arguing that the commission 
had the authority to regulate the competitive practices of professional 
organizations. A 1982 appeal to the Supreme Court resulted in a 4-4 
decision deadlock, so that the opinion of the lower court was upheld. 6 
This outcome empowered the FTC to forbid AMA bans on 
advertising for the solicitation of patients, except in cases in which advertising 
was false or deceptive. 

These actions facilitated a dramatic rise in physician advertising. 
While less than 5 percent of self-employed physicians advertised in 
1982, by 1987 this figure had risen to 20 percent. 7 As physician 
supply grows rapidly in the coming years, 8 there is every reason to 
suspect that this trend toward increased advertising will continue. 9 

III. Hypotheses to Be Tested 

The rapid growth of physician advertising and its uncertain influence 
on consumers raise questions about its economic consequences. Our 
analysis considers the entry effects of physician advertising. A 
physician’s degree of entry will be taken to be inversely related to years of 
practice experience. Female physicians and foreign medical 
graduates (FMGs) will also be considered entrants. 

5 In addition the Supreme Court determined, in the landmark case of Bates v. the 
State Bar of Arizona (1977), that comprehensive restrictions on advertising by 
professionals through state laws and regulations were unconstitutional. This decision in effect 
rescinded state laws that prohibited any advertising by professionals, including 
physicians. 

6 The ninth justice, Harry Blackmun, disqualified himself from the case without 
explanation. One source speculated that his decision may have stemmed from his prior 
association “with the medical profession as counsel to the Mayo Clinic in Minnesota 
during the 1950s” (see Pecarski 1982). 

7 These figures are obtained from, respectively, the AMA's Socioeconomic 
Monitoring System (SMS) for the fourth quarter of 1982 and core 1987 surveys of 
physicians. (Core surveys, conducted annually, are the largest and most comprehensive of 
the SMS surveys.) 

8 For example, a study by Kletke, Marder, and Silberger (1987) projects that the 
physician population will increase by 84 percent between 1985 and 2000, growing 
much faster than the U.S. population as a whole. 

9 Survey data on physician attitudes toward advertising support this claim. Folland 
(1987, p. 315) reports that almost one-half of the physicians surveyed “state that they 
will increase their use of marketing techniques when faced with increased competitive 
pressures. The relevance of this statement to advertising growth is clarified by related 
responses to items on attitudes and perceptions. Over one-half of the physicians state 
that they expect competitive pressures to increase. Furthermore, a large majority 
believes, quite contrary to the marketing literature, that marketing is merely a synonym 
for advertising. Thus, the picture emerges of a professional group that is reluctant to 
advertise but increasingly willing to do so in response to economic incentives.” 

Entry is customarily understood to mean the decision to participate 
in a market at all. However, entry need not be regarded as an all-or- 
nothing proposition. There can be degrees of entry. For example, 
even if two firms coexist within a given market, one may be better 
established and may have already attained enough customers to keep 
its plants operating at full capacity. Its competitor, however, may be a 
newcomer with relatively few customers and substantial excess 
capacity. The latter firm may be thought of as an entrant relative to the 
former. 

In the same way, physicians with relatively few years of practice 
experience may be considered entrants. Such physicians are more 
likely to be building their practices and to have more excess capacity. 
Similar reasoning may be applied to female physicians and FMGs. 
Traditionally, male physicians and U.S. medical graduates have 
dominated the physician services market in the United States. In recent 
years, however, this pattern has been challenged by the relative 
growth of female physicians and physicians trained outside of the 
United States (Kletke et al. 1987). 

Using these definitions of entry, we shall test the following 
hypotheses: H1: advertising promotes entry; H2: advertising inhibits entry; 
H3: advertising does not affect entry. 

A finding that advertising raises the earnings of 
less-well-established physicians relative to their better-established competitors 
would support H1. This would suggest that advertising heightens 
consumer awareness of alternative medical care providers, thus 
promoting entry. On the other hand, if advertising lowers the relative 
earnings of less-well-established physicians, this would suggest that 
advertising increases loyalty to better-established “brands” of medical 
providers, decreasing the competitive threat posed by entrants. Such 
a finding would support H2. Finally, if advertising has no effect on 
the relative earnings of entrants and incumbents, we would conclude 
that it has neither helped nor hindered entry into this market. 


IV. Empirical Specification of the Model 

The specification of the model follows standard procedures used 
for estimating earnings with self-selection criteria. Willis and Rosen 
(1979) provide an excellent discussion of these procedures. 

Assume that the physician expects to receive annual earnings equal 
to Y_a if he or she advertises and Y_b otherwise. The ith physician will 
advertise if Y_ai > Y_bi and will choose not to advertise if Y_ai ≤ Y_bi. 
With Pr as probability, the selection criteria are 

$$\Pr(\text{choose } a) = \Pr(Y_a > Y_b), \qquad \Pr(\text{choose } b) = \Pr(Y_a \le Y_b). \tag{1}$$

If the ith physician advertises, his or her earnings may be estimated as 

$$\ln Y_{ai} = c_a + d_a \cdot X_i + e_{ai}, \tag{2}$$

where ln Y_ai is the natural logarithm of annual earnings for the ith 
physician, X_i is a vector of exogenous variables, c_a and d_a are 
coefficients to be estimated, and e_ai is an error term. Similarly, if the ith 
physician does not advertise, earnings may be estimated as 

$$\ln Y_{bi} = c_b + d_b \cdot X_i + e_{bi}. \tag{3}$$


Self-Selection Effects 

To test the effects of advertising on physician earnings, we need to 
control for unobserved differences between the two cohorts in our 
sample. Advertisers and nonadvertisers may differ in ways that are 
not directly observable. For example, nonadvertisers may have 
stronger referral networks than advertisers. Advertisers may be physicians 
more skilled in self-promotion. Accordingly, our estimation strategy 
corrects for possible self-selection effects, using well-known 
econometric techniques (Heckman 1979; Maddala 1983). 

The decision whether to advertise depends on the vectors of 
exogenous variables X and Z: 

$$A_i = g + h \cdot X_i + j \cdot Z_i + u_i, \tag{4}$$

where A_i equals one if the ith physician advertises and equals zero 
otherwise; g, h, and j are coefficients to be estimated; and u_i is an error 
term. 
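
The mechanics of equations (2)-(4) can be illustrated with a two-step selection correction in the spirit of Heckman (1979). The sketch below is not the estimation code behind the reported results: it assumes a probit form for (4), simulates hypothetical data (all coefficient values, the sample size, and the binary Z standing in for an excluded variable are invented for illustration), fits the probit by Fisher scoring, and then adds the implied selection terms to separate earnings regressions for advertisers and nonadvertisers.

```python
import math

import numpy as np

_erf = np.vectorize(math.erf)


def norm_pdf(x):
    return np.exp(-0.5 * x ** 2) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    return 0.5 * (1.0 + _erf(x / math.sqrt(2.0)))


def fit_probit(w, a, iterations=25):
    """Probit coefficients by Fisher scoring; w includes a constant column."""
    gamma = np.zeros(w.shape[1])
    for _ in range(iterations):
        index = w @ gamma
        cdf = np.clip(norm_cdf(index), 1e-10, 1.0 - 1e-10)
        pdf = norm_pdf(index)
        score = w.T @ (pdf * (a - cdf) / (cdf * (1.0 - cdf)))
        info = (w * (pdf ** 2 / (cdf * (1.0 - cdf)))[:, None]).T @ w
        gamma = gamma + np.linalg.solve(info, score)
    return gamma


def two_step(ln_y, x, z, a):
    """First stage: probit for (4). Second stage: (2) and (3) with selection terms."""
    ones = np.ones_like(x)
    w = np.column_stack([ones, x, z])
    gamma = fit_probit(w, a)
    index = w @ gamma
    mills_a = norm_pdf(index) / norm_cdf(index)            # E[u | advertise]
    mills_b = -norm_pdf(index) / (1.0 - norm_cdf(index))   # E[u | do not advertise]
    adv = a == 1
    design_a = np.column_stack([ones, x, mills_a])[adv]
    design_b = np.column_stack([ones, x, mills_b])[~adv]
    coef_a = np.linalg.lstsq(design_a, ln_y[adv], rcond=None)[0]
    coef_b = np.linalg.lstsq(design_b, ln_y[~adv], rcond=None)[0]
    return gamma, coef_a, coef_b


def simulate(n=20000, seed=0):
    """Hypothetical data: every number here is an assumption for illustration."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)                        # stand-in for an element of X
    z = rng.integers(0, 2, size=n).astype(float)  # stand-in for the excluded dummy
    u = rng.normal(size=n)
    e_a = 0.5 * u + 0.5 * rng.normal(size=n)      # correlated with u: selection
    e_b = -0.3 * u + 0.5 * rng.normal(size=n)
    a = (-0.5 - 0.5 * x + 1.0 * z + u > 0).astype(float)
    ln_y = np.where(a == 1, 1.0 + 0.3 * x + e_a, 1.2 + 0.5 * x + e_b)
    return ln_y, x, z, a


if __name__ == "__main__":
    gamma, coef_a, coef_b = two_step(*simulate())
    print("probit (g, h, j):", gamma)
    print("advertisers (c_a, d_a, selection term):", coef_a)
    print("nonadvertisers (c_b, d_b, selection term):", coef_b)
```

In this sketch the variable z enters only the first stage, which is what allows the selection terms to be separated from the effect of x in the earnings equations.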

Estimation of selection effects requires that X and Z have elements 
that are not in common. The problem is to find an exogenous variable 
that affects the physician’s decision to advertise but has no other 
impact on earnings. For this purpose we use a dummy variable 
(MOVED) that is equal to one if the physician is not practicing in the 
state in which he or she was graduated from medical school and zero 
otherwise. 10 Movers should perceive more variability in their earnings 

10 One could alternatively define movers as physicians who practice in a state other 
than their state of residency training. Unfortunately, information on the individual 
physician's place of residency training is not readily available. Unpublished AMA data 
(which present nationally aggregated trends in physician movement patterns as of 
1982) suggest, however, that this alternative measure of physician movement would be 
highly correlated with the one actually employed here since more than two-thirds of all 
potential than stayers, who are presumably more familiar with the 
market conditions in their practice areas and can thus put narrower 
bounds on their potential earnings. 

Advertising can reduce the variability of earnings even if prices 
remain fixed, as the following scenario illustrates. Define a good 
outcome as the earnings expected with a full patient caseload and a bad 
outcome as the earnings expected with a less than full patient 
caseload. If a good outcome occurs, advertising would add no benefits 
since a full patient caseload will already have been achieved. But 
advertising can have substantial benefits when a bad state of nature 
occurs, creating idle capacity. 

If movers perceive greater variability in potential earnings, they 
may expect to be left with more idle capacity than stayers if an 
unfavorable outcome occurs. Movers should be more likely to advertise 
in this case; (nonincreasing) risk aversion would reinforce this 
tendency. 
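
A small numerical illustration of this scenario may help. It is only a sketch under invented assumptions: two equally likely states, earnings of 100 with a full caseload, a shortfall of 20 (stayer) or 40 (mover) in the bad state, and advertising that recovers half of any shortfall. Earnings are gross of advertising costs, which the scenario above does not specify.

```python
def two_state_earnings(p_bad, full_earnings, shortfall, recovered_share=0.0):
    """Earnings lottery: advertising matters only in the bad (idle capacity) state."""
    good = full_earnings
    bad = full_earnings - shortfall * (1.0 - recovered_share)
    return [(1.0 - p_bad, good), (p_bad, bad)]


def mean_and_variance(outcomes):
    """Mean and variance of a list of (probability, earnings) pairs."""
    mean = sum(p * y for p, y in outcomes)
    variance = sum(p * (y - mean) ** 2 for p, y in outcomes)
    return mean, variance


if __name__ == "__main__":
    # Hypothetical numbers; the mover perceives a larger bad-state shortfall.
    for label, shortfall in [("stayer", 20.0), ("mover", 40.0)]:
        without_ads = mean_and_variance(two_state_earnings(0.5, 100.0, shortfall))
        with_ads = mean_and_variance(two_state_earnings(0.5, 100.0, shortfall, 0.5))
        print(label, "without ads:", without_ads, "with ads:", with_ads)
```

With these numbers advertising lowers the variance of earnings for both physicians while prices stay fixed, and the gain in expected earnings is larger for the mover, who expects more idle capacity in the bad state.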

We expect that movers will tend to migrate to states having higher 
earnings opportunities. But will movers’ earnings differ systematically 
from those of stayers once differences in average earnings 
opportunities have been controlled for? If they do, then the advertising 
equation will not be identified, and further statistical analysis correcting 
for self-selection will be problematic at best. 11 

The conventional wisdom argues that movers should exhibit 
positive self-selection bias. The assumption is that movers are a special 
group and that mobility costs increase the odds that the most capable 
and ambitious individuals will move. A recent study on the earnings 
of U.S. immigrants by Borjas (1987), however, challenges this 
convention. He points out that the presence of positive selection effects 
for movers is an empirical question, one that cannot be resolved a 
priori. 12 Furthermore, he notes, positive self-selection “requires a set 


physicians receive their residency and medical school training within the same state. A 
number of other variables were tried in an attempt to identify the advertising equation. 
These include measures of consumer exposure to advertising, such as area per capita 
newspaper circulation, and more aggregated measures of physician mobility, such as 
the mean state-level percentage of physicians who received their medical education out 
of state. While the signs of the coefficients on these variables were usually in the 
expected direction, they were not statistically significant. 

11 Maddala (1983, chap. 8) discusses the identification issue for two-stage switching 
regression models. He notes that identification requires that at least one explanatory 
variable be excluded from the second-stage estimates except in the special case in which 
the error term from one cohort is uncorrelated with the error term from a second 
cohort. 

12 A recent study of rural Mexican migration to the United States by Stark and 
Taylor (1988) provides an interesting counterexample to the conventional wisdom that 
movers should have higher earnings potential than stayers. The researchers found no 
evidence to suggest that the earnings of immigrants (as measured by remittances to 
their families in the home country) differ from what stayers could expect to earn if they 
of conditions that will not be generally satisfied” (p. 552). 13 An 
important result of his analysis is that “foreign-born persons in the United 
States need not be drawn from the most able and most ambitious in 
the country of origin” (p. 551). 

Whether there will be any significant selection on earnings (positive 
or negative) for movers depends on the size of mobility costs, which 
are both monetary and psychic. 14 For immigrants to the United 
States, especially those traveling great distances and coming from 
very different cultural backgrounds, mobility costs may be substantial. 
For most physicians, such costs are far lower. 15 The physician who 


migrated to the United States. They did find, however, that the earnings of stayers ex¬ 
ceeded what the cohort of immigrants could expect to earn in Mexico had they chosen 
not to migrate. How might we explain this phenomenon? A plausible explanation is 
that actual earnings depend on business connections, landholdings, and other factors 
besides native ability. Mexicans who elect not to migrate may be relatively well endowed 
in terms of business connections, landholdings, and so on in the home country, and for 
that reason they may earn more than migrants could expect had they stayed in Mexico. 
Once migration occurs, native ability becomes more important because many of the 
additional factors noted (such as business connections) may not be exportable. And 
when native ability becomes more important, the earnings differential between the two 
cohorts evaporates. Thus the Stark-Taylor results are consistent with the notion that 
migration may depend on unobserved factors (such as poor business connections in the 
home country or poor soil) that are unrelated to native ability. At the same time, their 
results suggest that stayers may enjoy an advantage in these unobserved factors, which 
help determine earnings. These results seem to suggest that, contrary to the conven¬ 
tional wisdom, we should observe lower earnings for movers since the individuals with 
whom they must now compete presumably have better business connections, landhold¬ 
ings, and so on. However, an important point made by Borjas (1987) is that such gen¬ 
eralizations are dangerous: whether movers have higher, lower, or the same earnings as 
stayers may well vary from case to case. 

13 These conditions are that (1) error terms in the earnings of stayers are highly 
correlated with error terms in their earnings if they were to move, and (2) income is 
more dispersed in the areas in which movers locate. In the context of our analysis, the 
first condition requires that physicians who would earn above-average incomes in one 
state would also earn above-average incomes if they moved to another state. While one 
would expect there to be positive correlation here, it may not be sufficiently high to 
result in positive bias. Furthermore, positive selection requires that both conditions 1 
and 2 be satisfied. Empirical tests for the existence of positive selection (described in the 
text) suggest that they are not satisfied. 

14 Borjas (1987, p. 535) observes that "mobility costs ensure that only some per¬ 
sons . . . find it worthwhile to emigrate and thereby create the selection biases that 
are apparent in immigration data." An alternative potential source of selection bias is 
that movers are less risk averse than stayers. This possibility has been rejected in the 
recent literature on labor migration. Movers tend to exhibit risk-averse behavior in 
other aspects of economic decision making, so there is little reason to suspect that they 
will exhibit risk-loving behavior just with respect to moving. See Katz and Stark (1986) 
for further details. 

15 Mobility costs may be substantial for FMGs, most of whom take their residencies in 
the same locale as their medical school, since many of these physicians will be coming to 
the United States from a foreign country to start their practices. Earnings of FMGs may 
also differ because their training differs from that of U.S. medical graduates and, for 
foreign-born FMGs, because of differences in cultural background. However, FMG 
status is explicitly controlled for in the earnings equation presented later in the paper. 
moves incurs travel costs and sacrifices professional connections made 
during medical education. On the other hand, a number of states 
offer physicians financial incentives to practice there. 16 This could 
offset, or even outweigh, moving costs and opportunity costs from 
forgone professional connections. 

Finally, the importance of psychic costs as a barrier to physician 
movement is unclear. Some physicians want to locate their practice in 
a particular state other than that of their medical school, in which case 
moving provides psychic benefits. In any event, the psychic costs of 
locating in a different state are surely much lower than the psychic 
costs of leaving one’s country entirely. 

This discussion suggests that self-selection effects on earnings for 
physicians who move may not be important, either because mobility 
costs are not large or because mobility costs differ across physicians 
because of differences in unobserved characteristics that are not asso¬ 
ciated with earnings ability (e.g., physicians’ locational preferences). 
To illustrate, suppose that two physicians have equal earnings ability 
(equal credentials, native ability, and so forth). The only difference 
between them is that one likes the state in which he or she was gradu¬ 
ated from medical school while the other strongly prefers another 
state. The former physician will have higher mobility costs than the 
latter (if the psychic benefits from moving are factored in, mobility 
costs for the latter physician may be negligible). In this example, one 
physician moves and the other stays, but this outcome does not 
change the geographic distribution of ability. 17 

As a check on the empirical validity of this hypothesis, we included 
the variable MOVED along with the other variables used in the adver¬ 
tising probit regression (see table 1) to determine physicians’ annual 
earnings. We ran such a regression for the entire physician sample, as 
well as separate regressions for advertisers and nonadvertisers. In no 
case was any statistically significant relationship observed between the 
variable MOVED and physicians’ annual earnings. Thus it seems un¬ 
likely that there are systematic differences in the average earnings 
ability of movers versus stayers. 


Hence any observed effects of physician movement on earnings should be purged of 
selection effects induced by the potentially large mobility costs borne by FMGs, or by 
earnings differentials reflecting differences in FMG training or cultural background. 
Furthermore, the entry implications of physician advertising do not change when 
FMGs are excluded from the analysis. 

16 For example, Burfield, Hough, and Marder (1986, p. 546) note that some states 
attract physicians through a “‘beggar-thy-neighbor’ policy, relying on other states to 
train physicians and then giving financial incentives to induce those physicians to 
relocate.” 

17 The same result would follow if two otherwise identical physicians differ in the 
amount of student loans they have incurred. The one with a heavier burden may 
choose a state with better earnings opportunities. 



TABLE 1 

Mean Values for Variables Used in Study 

Variable Name     All Physicians     Nonadvertisers     Advertisers 
                  (N = 1,995)        (N = 1,603)        (N = 392) 

ADVERT            .20 (.40)          .00 (.00)          1.00 (.00) 
ln Y*             -2.07 (.65)        -2.06 (.64)        -2.11 (.67) 
MOVED             .65 (.48)          .63 (.48)          .71 (.46) 
EXP/100           .22 (.11)          .23 (.11)          .18 (94) 
(EXP/100)²        .06 (.06)          .07 (.06)          .04 (.04) 
FEMALE            .07 (.25)          .05 (.23)          .11 (.32) 
FMG               .18 (.39)          .17 (.38)          .23 (.42) 
BDCERT            .74 (.44)          .75 (.43)          .70 (.46) 
GROUP             .44 (.50)          .42 (.49)          .53 (.50) 
CORP              .50 (.50)          .50 (.50)          .47 (.50) 
AVGINC            .81 (.09)          .80 (.08)          .81 (.09) 
ln AVGINC*        -.22 (.11)         -.22 (.11)         -.21 (.11) 
URBAN             .62 (.38)          .63 (.38)          .57 (.38) 
IMED              .21 (.40)          .20 (.40)          .21 (.41) 
SURGS             .26 (.44)          .27 (.44)          .22 (.41) 
OBGYN             .08 (.28)          .09 (.28)          .07 (.26) 
PED               .07 (.26)          .07 (.26)          .07 (.25) 
PSYCH             .08 (.28)          .09 (.29)          .06 (.23) 
OTHER             .12 (.33)          .12 (.33)          .12 (.32) 
SLFSLC            .00 (.67)          .32 (.16)          -1.29 (.31) 

Note.—Standard deviations are in parentheses. 

* These values are negative because earnings were normalized to lie between zero and one before the logarithmic 
transformation was taken. 
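
As a quick consistency check on the dummy variables in Table 1, recall that a 0/1 variable with mean p has a (population) standard deviation of sqrt(p(1 - p)). The short sketch below applies that formula to a few of the reported means. The function name `binary_sd` is a hypothetical helper introduced here for illustration, and because the published means are rounded to two decimals, small discrepancies from the reported standard deviations are to be expected.

```python
import math


def binary_sd(mean):
    """Population standard deviation of a 0/1 variable with the given mean."""
    return math.sqrt(mean * (1.0 - mean))


# Means for the full sample as reported in Table 1 (illustrative subset).
reported_means = {"ADVERT": 0.20, "MOVED": 0.65, "GROUP": 0.44}

for name, mean in reported_means.items():
    # ADVERT: mean .20 implies a standard deviation of .40.
    print(f"{name}: mean={mean:.2f}, implied sd={binary_sd(mean):.2f}")
```

The formula applies only to the binary variables; continuous variables such as EXP/100 and AVGINC have no such link between mean and dispersion.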
The model specified in Section IV is estimated using data from the 
AMA’s SMS 1987 core survey, which includes data on the advertising 
practices of a nationally representative sample of 4,014 physicians 
(response rate was 67.0 percent). The sample used for this study is 
restricted to self-employed physicians since employee physicians are 
likely to have little control over advertising decisions. 18 Also excluded 
were specialists such as anesthesiologists and pathologists, who tend to 
be hospital-based and to rely on referrals for patients, and thus are 
unlikely to consider advertising. 

These exclusions left a sample of 2,643 physicians. Missing values 
in some of the response variables (most notably in physician earnings) 
caused an additional 648 observations to be lost, so that the usable 
sample numbered 1,995 physicians. 19 Fortunately a comparison of 
the total sample of self-employed physicians with the sample of self- 
employeds who report earnings suggests that the two groups are very 
similar, at least in terms of measured characteristics. 

We begin by estimating a probit regression to determine the adver¬ 
tising decision. Advertisers are physicians who, at any time during the 
previous 5 years, had advertised their practices in newspapers or 
magazines or on television or radio. 20 In addition to the variable 
MOVED discussed above, other explanatory variables included are 
the physician’s sex, FMG status, board certification status, years of 
experience, and specialty. Dummy variables are also included that 
measure whether the physician’s practice is solo or group and 
whether the practice is incorporated. Table 1 lists the mean values of 
the variables used in this study. (To facilitate comparisons of coeffi¬ 
cients, all continuous variables are normalized to lie between zero and 
one; variable names and descriptions are listed in the Appendix.) 

While the impact of these variables on the propensity to advertise is 
an empirical question, we expected that less experienced physicians, 
women, and FMGs would be more likely to advertise. Older physi¬ 
cians and those having “traditional” characteristics seem more likely 


18 Employee physicians are growing in number, but only gradually; the strong ma¬ 
jority of physicians (73.5 percent in 1987 according to AMA survey data) are self- 
employed. 

19 In their study on self-selection in educational choice, Willis and Rosen (1979) also 
noted a substantial number of missing values due to nonresponse to questions about 
earnings. Specifically, they lost 952 observations out of 5,085 respondents because of 
nonresponse to questions about initial and later earnings. Apparently many people are 
touchy about divulging this information. 

20 Unfortunately, the AMA survey did not ask advertising physicians how long they 
had been advertising. Since physician advertising is a relatively recent phenomenon, 
increasing fourfold over the period 1982-87, it seems likely that physicians who have 
been consistently advertising for 5 years or more are comparatively rare. 
to adhere to the traditional canons of the medical profession, which 
frown on advertising. 

Group practice physicians should be more likely to advertise. (This 
hypothesis assumes that there is a positive relationship between the 
scale of an enterprise and the returns to a given amount of advertis¬ 
ing, as suggested by the evidence that a greater proportion of large 
than small firms advertise.) Board-certified physicians should be less 
likely to advertise if they obtain referrals more easily than uncertified 
physicians. 

We include a variable measuring whether the physician's practice is 
incorporated in order to test whether the incorporated physician is 
more likely to conduct his or her practice like a business and therefore 
to advertise. The propensity of specialists (relative to general/family 
practitioners) to advertise is unclear a priori. Since the demand for 
primary-care physicians (general/family practitioners, internal medi¬ 
cine specialists, and pediatricians) is substantially more price sensitive 
than the demand for surgical and other nonprimary care (see Pauly 
and Satterthwaite 1981), differences between primary- and nonpri¬ 
mary-care physicians in advertising may provide some evidence on 
the relationship between demand elasticity and physician advertising. 

Finally, two variables are included to control for differences in the 
practice environment: the physician’s urban or rural location and 
average physician earnings from 1981 to 1985 in the state in which 
the physician practices. 21 

Table 2 shows the results of the probit regression. A number of the 
explanatory variables are highly significant. 22 Physicians who practice 
outside their state of medical school graduation are more likely to 
advertise, at the 1 percent level of significance. As expected, group 
practice physicians, females, and FMGs are significantly more likely 
to advertise, while older physicians are less likely to advertise. The 


21 As a check on the robustness of the results reported below, we also estimated the 
model defined by eqq. (2)-(4) including a variety of additional variables to control for 
variations in market conditions (i.e., per capita physicians, area income, age composi¬ 
tion of the population, penetration by health maintenance organizations, etc.) and 
dummy variables to control for regional effects. Including these additional variables 
had very little effect on the results reported below and added little explanatory power 
to either the advertising or earnings regressions; hence, they were omitted from the 
final empirical estimates. 

22 Folland (1987) examined the determinants of physician advertising using a sample 
of about 350 physicians from Pennsylvania. Because of his small sample size, his deter¬ 
minants have considerably less explanatory power than the results obtained here. Fol¬ 
land did demonstrate a significant negative relationship between years of practice expe¬ 
rience and propensity to advertise, however. Our results on the relationship between 
advertising and physician and practice characteristics are similar to those reported by 
Rizzo (1988) in a comment on the Folland paper. However, Rizzo did not test for 
potential relationships between market characteristics and the advertising decision, nor 
did his analysis examine the competitive implications of advertising. 
TABLE 2 

Probit Regression for the Determinants of 
Physician Advertising (N = 1,995) 

Dependent Variable = ADVERT 

Independent Variable 

MOVED             .28***       (3.56) 
EXP/100           -2.76***     (8.20) 
FEMALE                         (2.73) 
FMG               .17*         (1.90) 
BDCERT            -.11         (1.42) 
GROUP             .29***       (3.99) 
CORP              -.09         (1.32) 
AVGINC            .83**        (2.11) 
URBAN             -.12         (1.33) 
IMED              -.38***      (3.54) 
SURGS             -.38***      (3.67) 
OBGYN             -.47***      (3.27) 
PED               -.45***      (3.04) 
PSYCH             -.47***      (2.99) 
OTHER             -.42***      (3.36) 

* Statistically significant at the 10 percent level. 
** Statistically significant at the 5 percent level. 
*** Statistically significant at the 1 percent level. 


coefficient on the board certification status variable is negative but not 
statistically significant. 

All specialists are significantly less likely to advertise than general/family 
practitioners. Specialists may rely more heavily on referrals 
than generalists do and hence have less need for advertising to procure 
patients. The primary-care specialties also differ in the propensity 
to advertise, with pediatricians and internal medicine specialists 
significantly less likely to advertise than general/family practitioners. 
In other words, these results provide little basis for inferences about 
the relationship between demand elasticity and the propensity to advertise. 
TABLE 3 

Effects of Advertising on Physician Earnings (N = 1,995) 

Dependent Variable = ln Y 

Independent       ADVERT = 1                ADVERT = 0 
Variable          (N = 392, R² = .31)       (N = 1,603, R² = .31) 

SLFSLC            .31       (.93)           -.30      (1.20) 
EXP/100           4.99***   (3.44)          3.27***   (5.07) 
(EXP/100)²        -9.64***  (3.01)          -7.83***  (7.64) 
FEMALE            -.56***   (4.47)          -.28***   (4.00) 
FMG               -.17*     (1.71)          .02       (.35) 
BDCERT            .09       (LSI)           .20***    (5.72) 
GROUP             .16*      (1.79)          .23***    (5.76) 
CORP              .18***    (2.68)          .15***    (5.17) 
ln AVGINC         .73**     (2.49)          .55***    (4.12) 
URBAN             .05       (.58)           .04       (.95) 
IMED              .40***    (3.39)          .17***    (2.91) 
SURGS             .66***    (5.50)          .42***    (7.55) 
OBGYN             .67***    (4.02)          .33***    (4.67) 
PED               .21       (1.21)          -.02      (.26) 
PSYCH             .36**     (2.16)          .16**     (2.22) 
OTHER             .61***    (4.23)          .39***    (6.00) 

* Statistically significant at the 10 percent level. 
** Statistically significant at the 5 percent level. 
*** Statistically significant at the 1 percent level. 


Average physician earnings (AVGINC) are directly related to ad¬ 
vertising propensity. The explanation for this result is not readily 
apparent and may depend on a host of factors, including differences 
in locational preferences between advertisers and nonadvertisers (i.e., 
the former may prefer to locate in areas in which earnings opportuni¬ 
ties are greater). 

Table 3 presents estimates of physician earnings corrected for self-selection. 
The dependent variable is the natural logarithm of the 
physician’s net annual earnings and is obtained from the SMS data. 
The self-selection variable is defined as 

\[
\mathrm{SLFSLC}_i =
\begin{cases}
-\dfrac{f(\Psi_i/s_u)}{F(\Psi_i/s_u)} & \text{if the $i$th physician advertises,} \\[2ex]
\dfrac{f(\Psi_i/s_u)}{1 - F(\Psi_i/s_u)} & \text{if the $i$th physician does not advertise,}
\end{cases}
\]

where $\Psi_i = g + h \cdot X_i + j \cdot Z_i$ is estimated from equation (4), $s_u$ is 
the standard deviation of $u$, $F(\Psi_i/s_u)$ is the standard normal cumulative 
distribution function estimated from the probit regression, and 
$f(\Psi_i/s_u)$ is the probability density function estimated from the probit 
regression. 
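
To make the construction of the self-selection variable concrete, the following minimal Python sketch computes a selection term of the usual inverse-Mills-ratio form from a standardized probit index. It is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the index value is invented, and the sign convention (negative for advertisers, positive for nonadvertisers) is assumed here because it matches the signs of the SLFSLC means reported in Table 1.

```python
import math


def std_normal_pdf(z):
    """Standard normal probability density function f(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z):
    """Standard normal cumulative distribution function F(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def slfslc(index, advertises):
    """Selection term for one physician.

    `index` stands for the standardized probit index (Psi_i / s_u).
    Assumed convention: -f/F for advertisers, f/(1 - F) for nonadvertisers.
    """
    f = std_normal_pdf(index)
    F = std_normal_cdf(index)
    if advertises:
        return -f / F
    return f / (1.0 - F)


# Hypothetical index chosen so that the implied advertising probability
# is close to the 20 percent sample share of advertisers.
example_index = -0.84
print(slfslc(example_index, advertises=True))   # negative for an advertiser
print(slfslc(example_index, advertises=False))  # positive for a nonadvertiser
```

Under this convention the probability-weighted average of the two terms is zero for any index value, which is consistent with the full-sample mean of .00 for SLFSLC in Table 1.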

The estimated coefficient on SLFSLC in the advertising cohort is 
statistically insignificant. This suggests that observed earnings pat¬ 
terns of physicians who advertise do not differ significantly from 
those that would be observed if nonadvertisers with the same mea¬ 
sured characteristics as current advertisers had chosen to advertise. 
Apparently, advertisers do not have any hidden advantage in their 
ability to raise their earnings through advertising. The coefficient on 
SLFSLC for nonadvertisers also indicates no significant selection bias. 
That is, observed earnings among nonadvertisers do not differ sig¬ 
nificantly from the patterns expected if advertisers with the same 
measured characteristics had chosen not to advertise. This suggests 
that physicians who refrain from advertising may do so for reasons 
other than financial considerations. 

The coefficients on the physician’s sex, FMG status, and years of 
experience are consistent with the notion that physician advertising 
tends to inhibit entry. For example, in the nonadvertising group, 
earnings rise less steeply with experience than in the advertising 
group. 23 This suggests that physician advertising does not act as a 
substitute for experience but is complementary to experience. 24 

Among physicians who advertise, the relative earnings of female 


23 A caveat is in order, however. Although the relative earnings of entrants decline in 
the advertising cohort, advertising could eventually yield benefits to entrants by helping 
them to become better known to both potential customers and their more established 
peers (on the other hand, advertising may alienate fellow physicians). We are grateful 
to the editor for pointing this out. 

24 The welfare implications of this result are unclear. On the one hand, consumers 
may pick more experienced physicians because they believe experience raises quality. 
Since advertising may make it easier for consumers to locate this preferred physician 
type, it would seem to improve welfare. However, even if consumers are correct in 
believing that more experienced physicians offer higher quality on average, they may 
overestimate the quality differential. If so, advertising may lead consumers to favor 
more experienced physicians much more than they would if they were fully informed 
about quality. 
physicians and FMGs fall markedly. 25 This suggests that the rates of 
return to advertising are substantially lower for female and FMG 
physicians than for males and non-FMGs, respectively. 26 Therefore, 
introducing an advertising regime gives experienced physicians an 
added potential advantage. But within such a regime, entrants, as 
individuals, may find that they can increase their incomes through 
advertising. (Though a rules change may help one group—in this 
case incumbents—more than another, individuals within each group 
should respond in their own self-interest, given the new rules.) Since 
there is no strong evidence suggesting that a physician’s sex or FMG 
status is related to quality of care, 27 the substantially lower rates of 
return to advertising by females and FMGs may reflect consumer 
preferences, ignorance, or, more disturbing, prejudice. 

Uncertified physicians fare better than board-certified physicians 
under advertising. The implications of this result for the relationship 


25 We tested differences between the advertising and nonadvertising regressions in 
the estimated coefficients for SEX, FMG, EXP, and EXP² for statistical significance. 
Specifically, we pooled the advertising and nonadvertising cohorts, adding interaction 
terms to allow for the coefficients on the explanatory variables and the intercept term to 
vary across these cohorts (e.g., we interacted the variable SEX with a dummy variable 
equal to one if the physician was not an advertiser and zero otherwise). Including these 
interaction terms introduced a high degree of correlation among the regressors. Furthermore, 
the estimated coefficients on the interaction terms were quite sensitive to 
small changes in the empirical specification. Therefore, the results of this test must be 
viewed with some skepticism since multicollinearity may pose a serious problem in the 
pooled regression. Multicollinearity could lead to statistically insignificant test results 
when the actual differences in coefficients are in fact significant. In discussing tests for 
the statistical significance of coefficients across regressions, Maddala (1977, p. 199) 
notes that “if there is a high degree of multicollinearity in the regressors, it is not 
unusual that what look to us like drastic differences in the coefficients turn out to be 
‘statistically insignificant.’ In such cases one should . . . not get too excited about having 
found the differences ‘statistically insignificant,’ because from the practical point of 
view these differences are often very significant.” We did find, however, that the 
differences for females and FMGs across the two cohorts were statistically significant. 
While the pooled regression estimates indicated a steeper experience-earnings profile 
in the advertising cohort, this result was not statistically significant. (This is not entirely 
surprising because the experience variables were highly correlated with the experience 
variables interacted with the cohort dummy.) 

26 Consumers' reactions to FMG advertisements probably depend more on the readily 
observed characteristics of FMGs (i.e., whether or not they are foreign-born) than on 
an assessment of the quality of foreign medical schools (since such information is not as 
readily available to the consumer). 

27 The lack of academic studies on male-female physician quality differentials probably 
reflects the lack of even casual evidence to suggest that there is a significant difference. 
By contrast, the issue of quality differentials between U.S. medical graduates and 
FMGs has received considerable attention, with mixed results. Several studies found 
that there are no significant differences in physician performance relating to FMG 
status. However, other studies that have focused on proxy measures for quality (i.e., 
board certification status, licensure status, performance on licensing exams, etc.) have 
concluded that FMGs offer inferior quality of care. See Rhee et al. (1986) for some 
recent evidence and references to earlier studies of this issue. 
between advertising and entry are unclear. On the one hand, it might 
be argued that referral networks unjustly discriminate against 
uncertified physicians and that advertising helps break down the mo¬ 
nopoly power of these networks, thus promoting entry. 

An alternative view is that the earnings advantage enjoyed by 
certified physicians is justified because they provide higher-quality 
care than uncertified physicians, and this is why they are treated 
favorably by referral networks. In this case, the improved relative 
earnings of uncertified physicians under advertising may result from 
consumer ignorance about differences in physicians’ certification 
status or the implications of these differences for quality of care. 28 
Advertising may appear to have increased entry, but only because 
consumers lack adequate information to make informed decisions 
about physicians. With better information, this apparent increase in 
entry may disappear. 

Earnings of group practice physicians decline relative to those of 
solo practitioners when both types of physicians advertise. Group 
practice physicians may have stronger referral networks (and hence 
less excess capacity) than solo practitioners and stand to gain less from 
advertising on that account. An alternative possibility is that consum¬ 
ers are more responsive to the advertisements of solo practitioners. 
The entry implications of this result are also unclear. On the one 
hand, solo practice is the traditional form. This suggests that the 
improved performance of solo practitioners under advertising is de¬ 
terring the entry of less traditional forms of medical practice. On the 
other hand, group practitioners earn significantly more than solos in 
the absence of advertising. To that extent, the improved performance 
of solos might be construed as a pro-entry advertising effect. Since 
group practice is becoming increasingly popular, however, the impor¬ 
tance of the entry implications of this result is apt to decline over time. 

Earnings for incorporated practices are significantly higher in both 
the advertising and nonadvertising cohorts. Similarly, earnings are 
higher in states with higher average physician earnings. 

Specialists, particularly those in obstetrics/gynecology and internal 
medicine, appear to fare better than generalists under advertising. 
Possibly specialists earn a higher return from advertising because they 
are filling a larger gap in the consumer’s information than generalists, 
whose services may already be well understood by the consumer. 


28 Reade and Ratzan (1987) note, e.g., that physicians listed as specialists in Yellow Pages 
directories are not distinguished by board certification status. Furthermore, while a 
board-certified physician might make his or her certification status known in an advertisement, 
a physician who is not certified is unlikely to highlight this fact. Uncertified 
physicians are more likely simply to list their specialty, perhaps citing professional 
organizations to which they belong, previous achievements, and so on. 
Given their higher returns, it is somewhat puzzling that specialists are 
uniformly less likely to advertise than generalists. It may be that spe¬ 
cialists, who depend on referrals for a substantial part of their 
caseloads, are reluctant to engage in activities (such as advertising) 
that many physicians in the referral network may disapprove of. 

Paradoxically, although the returns to advertising are directly re¬ 
lated to years of practice experience (EXP), this variable is correlated 
with a lower propensity to advertise. Conversely, while advertising 
seems to offer relatively less additional income to women and FMGs, 
they are more likely to advertise. Furthermore, the magnitude of the 
earnings differential between entrants and incumbents may increase 
substantially in the advertising cohort. For example, using the esti¬ 
mated coefficients from the regressions reported in table 3, purged of 
selection effects, and mean values for the variables from the entire 
sample, we find that, during the first 20 years of practice, the age- 
earnings profile grows at an average annual rate of about 2 percent in 
the nonadvertising cohort. In the advertising cohort, the growth rate 
is 4 percent. 29 
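
The mechanics behind a steeper age-earnings profile can be illustrated with a rough calculation that uses only the experience coefficients from table 3. The Python sketch below is a simplification and not the authors' procedure: it ignores the other regressors and the sample means, measures growth as the average annual change in log earnings, and uses hypothetical function names, so it should not be expected to reproduce the 2 and 4 percent figures exactly. It only shows why the quadratic experience terms imply faster early-career growth in the advertising cohort.

```python
def log_earnings_gain(b_exp, b_exp2, years):
    """Change in ln Y over `years` of experience, given coefficients on
    EXP/100 and (EXP/100)^2 and holding all other variables fixed."""
    x = years / 100.0
    return b_exp * x + b_exp2 * x * x


def avg_annual_log_growth(b_exp, b_exp2, years):
    """Average annual change in ln Y over the first `years` of practice."""
    return log_earnings_gain(b_exp, b_exp2, years) / years


def peak_experience_years(b_exp, b_exp2):
    """Years of experience at which the quadratic profile reaches its maximum."""
    return -b_exp / (2.0 * b_exp2) * 100.0


# Experience coefficients as reported in table 3.
advertisers = (4.99, -9.64)
nonadvertisers = (3.27, -7.83)

adv_growth = avg_annual_log_growth(*advertisers, years=20)
non_growth = avg_annual_log_growth(*nonadvertisers, years=20)
print(f"advertisers:    {adv_growth:.4f} per year")
print(f"nonadvertisers: {non_growth:.4f} per year")
print(f"peak (advertisers):    {peak_experience_years(*advertisers):.1f} years")
print(f"peak (nonadvertisers): {peak_experience_years(*nonadvertisers):.1f} years")
```

Under these simplifying assumptions the advertising cohort shows the faster average growth over the first 20 years, which is the qualitative pattern described in the text.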

Recent survey results reveal some interesting age and sex differ¬ 
ences in physicians’ evaluation of advertising effectiveness (Powills 
1987). In particular, male physicians over 50 years of age were most 
receptive to the notion that advertising has been effective in inform¬ 
ing the public about hospital services. Such sentiments were far less 
prevalent among physicians under the age of 30. If one believes that 
physicians are more likely to view ads as being effective if they benefit 
personally from advertising, then these results are consistent with our 
findings that older, male physicians benefit more from advertising 
than their younger, female counterparts. 

That experienced physicians tend to avoid advertising, even 
though as a group they would benefit most from a regime in which 
advertising was well established, suggests that these physicians may 
be particularly concerned about the potential negative connotations 
of advertising. Indeed, this somewhat peculiar equilibrium can be 
understood in terms of an adverse selection model. To simplify, as¬ 
sume that the distributions of quality in the cohorts of established 
and new doctors are the same. New doctors, however, have a much 
weaker ethical inhibition against advertising. Among established phy¬ 
sicians, only the most aggressive and those most in need of patients 
will advertise. 30 Advertising will have a strong negative overtone, 

29 These results are meant to be illustrative. The age-earnings profile is steeper in the 
advertising cohort for all plausible time horizons. 

30 The reader, presumably an economist, might inquire whether he or she would 
advertise, and if not, why not. The answer is likely to be the same as that for physicians: 
Few peers do it, many would look down on it, and it sends a bad signal. 
which may be felt by and may deter some physicians who would 
otherwise like to advertise. Among physician entrants, by contrast, if 
advertising is relatively acceptable, even high-quality physicians may 
advertise; the adverse implications will be less strong. 

These considerations suggest two reasons why new physicians may 
be more likely to advertise even though they secure a lower return 
from it. First, with fewer years of exposure to the antiadvertising 
ethic, they may not have internalized a strong distaste for the practice. 
Second, given greater advertising among such physicians, a less nega¬ 
tive signal is sent when they do advertise. In addition, for a variety of 
utility functions, new physicians—though they derive less benefit— 
may advertise more because their incomes are lower or more vari¬ 
able. 31 

Over time, of course, as the current generation of older physicians 
age and retire, advertising will become more widespread and less easy 
to interpret as a negative signal among older physicians, thus tempt¬ 
ing some older physicians to try the new methods. If there are no 
other changes, the long-run equilibrium will impose equal inhibitions 
on advertising across ages. 

VI. Conclusion 

This paper has examined the impact of advertising on entry in the 
market for physician services. The results differ sharply from earlier 
findings, which have tended to indicate that (1) entry-deterring consequences 
of advertising are confined to industries characterized by 
heavy advertising (Comanor and Wilson 1979), and (2) for the strong 
majority of manufacturing industries, advertising promotes entry 
(Kessides 1986). 

Physician advertising acts as a complement to experience, not a 
substitute. The returns to advertising are substantially lower for fe¬ 
male physicians and FMGs (entrants) than for males and non-FMGs, 
respectively. In other words, an equilibrium in which inhibitions to 
advertising melt away will not improve the relative financial status of 
less-well-established physicians. Nevertheless, females, FMGs, and 
less experienced physicians, following their incentives as individuals, 
are all more likely to advertise than males, non-FMGs, and more 
experienced physicians. 

Factors beyond income maximization appear to inhibit advertising, 
such as social pressures, ethical sentiment, or a perception that physicians 


31 Advertising may also reduce the variability of income. Under the assumption of 
decreasing risk aversion, poorer individuals will pay more, in money expectation or 
advertising distaste, for the same absolute shrinkage in income variability. 

who advertise are of lower quality than others. More experienced 
physicians, males, and U.S. medical graduates may attach 
greater importance to these inhibiting factors, in part because their 
earnings, and possibly perceived quality, tend to be higher even if 
they do not advertise, but also, in the case of older physicians, because 
they have lived for many years under regimes that prohibited or 
strongly frowned on advertising. (Our tests for selection bias revealed 
that nonadvertisers do not appear to have an advantage in achieving 
earnings without advertising, which is consistent with the notion that 
nonfinancial considerations inhibit physicians from advertising.) 

To the extent that physician advertising has inhibited entry, it has 
had an anticompetitive effect in this market. As Comanor and Wilson 
(1979, p. 472) have noted, however, anticompetitive consequences 
from advertising pose a difficult question of public policy: “the simple 
finding that an anticompetitive effect exists is not sufficient to imply 
that policy actions are required. For example, to the extent that con¬ 
sumer information is increased in the same process that monopoly 
power is attained, we may be unwilling to adopt specific policy mea¬ 
sures directed against the latter for fear of adversely affecting the 
former as well.” 

A prudent course of action for policymakers would be to explore 
options for retaining the benefits from advertising while mitigating 
the costs. The present problems with physician advertising may stem 
from consumers’ inability to use information effectively. Uncertain 
how to judge quality, consumers may be relying on poor indicators 
such as the physician’s sex and FMG status. 32 

Our results have three important implications. First, we observe 
that the decision whether or not to advertise is not merely a choice on 
how to maximize income. Ethics, norms, and social inhibitions appear 
to matter. Moreover, because such factors play a role, even physicians 
with no personal aversions to advertising must worry about the impli¬ 
cations the practice conveys. Second, we find that the market for 
physician advertising is still progressing toward equilibrium. From an 
analytic standpoint, it is reassuring that an econometric investigation 
can predict properties at the equilibrium that may be quite different 
from what is observed today. From a policy standpoint, it appears, the 
FTC’s strong interventions in favor of physician advertising may have 

32 Survey data indicate that most physicians believe that advertising will not enable 
the consumer to make better-informed physician choices. As Folland (1987, p. 315) 
reports, “Although most . . . physicians approve of advertising in general . . . they 
largely disapprove of consumer advertising by physicians on several grounds. First, 
physicians find advertising to be an unsuitable means of communicating medical information 
to the consumer. A large majority agrees that ‘it is difficult to advertise competence 
and quality of care in my profession.’ . . . In sum, advertising will not help 
consumers make more intelligent choices among physicians.” 
promoted entry and competition in the short run, while established 
physicians remained hesitant to advertise. Eventually, however, 
through the aging of the population and the breakdown of norms, 
physicians falling into this group will begin to advertise. Unless adver¬ 
tising yields benefits to entrants over time that we have not been able 
to measure (i.e., by increasing their visibility among potential patients 
and peers), more established physicians will gain at the expense of 
their less established peers. Competition will be diminished. If this 
proves to be the case, it is unlikely that any government agency will be 
able to reestablish an antiadvertising ethic. Third, we have added an 
important case study to the advertising and competition debate. In a 
market with complex information conditions, we have shown, adver¬ 
tising may inhibit rather than promote competition. 


Appendix 

Variable Names and Descriptions ** 


ADVERT        Dummy variable that equals one if physician advertised practice 
              by newspaper, magazine, television, and/or radio at any time 
              during 1981-86; equals zero otherwise 

ln Y          Natural logarithm of physician’s annual net earnings in 1986 

MOVED         Dummy variable that equals one if physician not practicing 
              in state in which attended medical school; equals zero otherwise 

EXP/100       Years of practice experience divided by 100 

(EXP/100)^2   Years of practice experience squared 

FEMALE        Dummy variable that equals one if physician is female; 
              equals zero otherwise 

FMG           Dummy variable that equals one if physician is a foreign 
              medical graduate; equals zero otherwise 

BDCERT        Dummy variable that equals one if physician is board certified; 
              equals zero otherwise 

GROUP         Dummy variable that equals one if physician’s practice is a 
              group practice; equals zero otherwise 

CORP          Dummy variable that equals one if physician’s practice is 
              incorporated; equals zero otherwise 

AVGINC        Average annual physician earnings in state in which physician 
              resides, 1981-85 

ln AVGINC     Natural logarithm of average annual physician earnings in 
              state in which physician resides, 1981-85 

URBAN         Dummy variable that equals one if physician is located in 
              county having more than 1 million inhabitants; equals .5 if 
              in county having 506,000-999,000 inhabitants; equals zero 
              otherwise 


** Continuous variables are normalized to lie between zero and one. All variables are 
either drawn directly or constructed from the AMA’s SMS or the AMA’s Physician 
IMED          Dummy variable that equals one if physician is a specialist in 
              internal medicine; equals zero otherwise 

SURGS         Dummy variable that equals one if physician is a general surgeon; 
              equals zero otherwise 

OBGYN         Dummy variable that equals one if physician specializes in 
              obstetrics/gynecology; equals zero otherwise 

PED           Dummy variable that equals one if physician is a pediatrician; 
              equals zero otherwise 

PSYCH         Dummy variable that equals one if physician is a psychiatrist; 
              equals zero otherwise 

OTHER         Dummy variable that equals one if physician is in specialty 
              other than those mentioned above; equals zero otherwise 

SLFSLC        Variable measuring selection effects (see text for definition) 


References 

AMA Judicial Council. “Statement of the Judicial Council Re: Advertising 
and Solicitation.” J. American Medical Assoc. 235 (May 24, 1976): 2328. 

Barney, Daniel R. “Health Care Advertising: What You Can and Cannot 
Say.” Health Care Management 2 (January 1985): 3-10. 

Borjas, George J. “Self-Selection and the Earnings of Immigrants.” A.E.R. 77 
(September 1987): 531-53. 

Burfield, W. Bradley; Hough, Douglas E.; and Marder, William D. “Location 
of Medical Education and Choice of Location of Practice.” J. Medical Education 
61 (July 1986): 545-54. 

Chamberlin, Edward H. The Theory of Monopolistic Competition: A Re-orientation 
of the Theory of Value. 8th ed. Cambridge, Mass.: Harvard Univ. 
Press, 1962. 

Comanor, William S., and Wilson, Thomas A. “The Effect of Advertising on 
Competition: A Survey.” J. Econ. Literature 17 (June 1979): 453-76. 

Folland, Sherman T. “Advertising by Physicians: Behavior and Attitudes.” 
Medical Care 25 (April 1987): 311-26. 

Gray, James. “The Selling of Medicine, 1986.” Medical Econ. (January 20, 
1986), pp. 180-94. 

Health Care Financing Administration. Division of National Cost Estimates, 
Office of the Actuary. “National Health Expenditures, 1986-2000.” Health 
Care Financing Rev. 8 (Summer 1987): 1-36. 

Heckman, James J. “Sample Selection Bias as a Specification Error.” Econometrica 
47 (January 1979): 153-61. 

Katz, Eliakim, and Stark, Oded. “Labor Migration and Risk Aversion in Less 
Developed Countries.” J. Labor Econ. 4 (January 1986): 134-49. 

Kessides, Ioannis N. “Advertising, Sunk Costs, and Barriers to Entry.” Rev. 
Econ. and Statis. 68 (February 1986): 84-95. 

Kletke, Phillip R.; Marder, William D.; and Silberger, Anne. The Demographics 
of Physician Supply: Trends and Projections. Chicago: Center Health Policy 
and Res., American Medical Assoc., 1987. 

Leake, Chauncey D., ed. Percival’s Medical Ethics. Huntington, N.Y.: Krieger, 
1975. 

Lee, Lung-Fei. “Unionism and Wage Rates: A Simultaneous Equations Model 
with Qualitative and Limited Dependent Variables.” Internat. Econ. Rev. 19 
(June 1978): 415-33. 




Leffler, Keith B. “Persuasion or Information? The Economics of Prescription 
Drug Advertising.” J. Law and Econ. 24 (April 1981): 45-74. 

Maddala, G. S. Econometrics. New York: McGraw-Hill, 1977. 

Maddala, G. S. Limited-Dependent and Qualitative Variables in Econometrics. 
New York: Cambridge Univ. Press, 1983. 

Pauly, Mark V., and Satterthwaite, Mark A. “The Pricing of Primary Care 
Physicians’ Services: A Test of the Role of Consumer Information.” Bell J. 
Econ. 12 (Autumn 1981): 488-506. 

Pecarski, Loraine. “Supreme Court Upholds FTC in Doctor Advertising 
Case.” U.S. Health Dollar 12 (April 2, 1982): 5. 

Powills, Suzanne. “Half of Physicians Say Ads Are Effective.” Hospitals 61 
(July 20, 1987): 40. 

Reade, Julia M., and Ratzan, Richard M. “Yellow Professionalism: Advertising 
by Physicians in the Yellow Pages.” New England J. Medicine 316 (May 
21, 1987): 1315-19. 

Rhee, Sang-O; Lyons, Thomas F.; Payne, Beverly C.; and Moskowitz, Samuel 
E. “USMGs versus FMGs: Are There Performance Differences in the Ambulatory 
Care Setting?” Medical Care 24 (March 1986): 248-58. 

Rizzo, John A. “Physician Advertising Revisited.” Medical Care 26 (December 
1988): 1238-44. 

Stark, Oded, and Taylor, J. Edward. “Relative Deprivation and International 
Migration.” Discussion Paper no. 36. Cambridge, Mass.: Harvard Univ., 
Center Population Studies, February 1988. 

Willis, Robert J., and Rosen, Sherwin. “Education and Self-Selection.” J.P.E. 
87, no. 5, pt. 2 (October 1979): S7-S36. 



International Evidence on the Size of the 
Random Walk in Output 


Timothy Cogley 

University of Washington 


This paper contributes three extensions of Cochrane’s work on 
measuring the relative stability of long-term growth. It estimates 
variance ratios for nine OECD countries over the period 1871-1985, 
presents an improved approximation to the distribution of the vari¬ 
ance ratio, and considers the comovements of long growth cycles 
across countries. The evidence indicates that the relative stability of 
long-term growth found by Cochrane is unique to the United States. 
Relative to the United States, most countries have more variable 
dynamics at low frequencies and smoother dynamics at frequencies 
traditionally associated with business cycles. 


A number of recent papers have reconsidered the traditional view 
that long-term growth is stable relative to short-term growth. Build¬ 
ing on work by Nelson and Plosser (1982), several authors have ar¬ 
gued that a substantial portion of output fluctuations are permanent. 
For example, Campbell and Mankiw (1987) find that U.S. post- 
World War II quarterly gross national product shows essentially no 
tendency to revert to its trend level after a disturbance. In fact, their 
estimates suggest that shocks to GNP are amplified, so that long-run 
variability is greater than short-run variability. 

Cochrane (1988), on the other hand, presents evidence that there is 
a considerable degree of trend reversion in output and that it occurs 
over a time horizon of many years. He argues that parsimonious time-series 
models attach too much weight to short-term dynamics and too 
little weight to long-term dynamics, so that they incorrectly measure 
the variability of long-term growth. As an alternative, he suggests 
measuring the relative stability of long-term growth by a variance 
ratio statistic, which is proportional to the variance of cumulative 
growth over a horizon of many years divided by the variance of 1-year 
growth, 

$$ V = \frac{k^{-1}\,\mathrm{var}[y(t) - y(t-k)]}{\mathrm{var}[y(t) - y(t-1)]}, $$

where y(t) represents the natural logarithm of GNP. 

For annual real per capita U.S. GNP, 1869-1986, Cochrane estimates 
that the variance ratio is roughly one-third. His results indicate 
that U.S. real growth rates over a horizon of a decade or more are in 
fact stable relative to yearly growth rates. Unfortunately, the variance 
ratio is measured very imprecisely. On the one hand, Cochrane argues 
that one cannot reject the hypothesis that the variance ratio is 
zero. This implies that all fluctuations are transitory and that long¬ 
term growth can be represented as a deterministic trend. On the 
other hand, his evidence is also consistent with rather large variance 
ratios, on the order of two-thirds. Because of this imprecision, Coch¬ 
rane concludes that the relative stability or instability of long-term 
growth is not a well-established stylized fact that models should seek 
to replicate. 

This paper contributes three extensions of Cochrane’s work. One 
extension is to consider whether his estimate is common to many 
countries or particular to the United States. If output dynamics are all 
alike, one might expect that many countries would have variance 
ratios on the order of one-third. The data analyzed in this paper 
include real per capita gross domestic product, 1870-1985, for Australia, 
Canada, Denmark, France, Italy, Norway, Sweden, and the United 
Kingdom, as well as real per capita GDP and GNP for the United 
States. Campbell and Mankiw (1988) and Clark (1989) also present 
international evidence on this issue, but their studies are based on 
post-World War II quarterly data. Since 40 years of data contain little 
independent information about long-term dynamics, it seems worth¬ 
while to consider evidence based on a longer time span. 1 

The second extension concerns the asymptotic distribution of the 
variance ratio statistic. Cochrane relies on a normal limiting distribu¬ 
tion, but Monte Carlo experiments indicate that there is a high degree 
of skewness in the empirical distribution of the variance ratio. In 

1 Kormendi and Meguire (1988) also provide international evidence based on a long 
time span of data. 
particular, the lower confidence bound based on the normal is much 
too low, which makes it too hard to reject small values of the variance 
ratio. There is, however, an asymptotically equivalent frequency do¬ 
main estimator whose limiting distribution can be approximated by a 
multiple of a chi-square variate. Monte Carlo experiments indicate 
that the chi-square approximation is a considerable improvement 
over the normal and that, in particular, it avoids making gross errors 
in the lower confidence bound. 

The results of this paper show that the relative stability of long¬ 
term growth found by Cochrane is unique to the United States. The 
United States has the smallest variance ratio among all the countries 
sampled. For countries other than the United States, the simple aver¬ 
age of the point estimates is 1.16, as compared with Cochrane’s esti¬ 
mate of one-third. For all the countries except for the United States 
and Canada, the 5 percent lower probability bounds are larger than 
Cochrane’s point estimate. A small variance ratio seems to be the 
exception rather than the rule. 

The paper also examines the covariation of long-term growth 
across countries to see whether long growth cycles are highly cor¬ 
related across countries or are idiosyncratic. It turns out that long¬ 
term growth rates are more highly correlated than 1-year growth 
rates. In fact, there is some evidence that levels of per capita output in 
the various countries are cointegrated. This implies that if levels of 
per capita output diverge by too much, there are forces that tend to 
pull them back together. 

The rest of the paper is organized as follows. Section I motivates 
the variance ratio statistic. Section II discusses two approximations to 
the distribution of the variance ratio. Section III presents cross¬ 
country estimates for the variance ratio and accounts for the cross¬ 
country differences. Section IV examines the covariation of long 
growth cycles. A brief summary (Sec. V) concludes the paper. 

I. Interpreting the Variance Ratio 

The variance ratio is proportional to the variance of cumulative 
growth over a k-year horizon divided by the variance of 1-year 
growth. Assume that output growth is a covariance stationary process, 
and let $c_{yy}(j)$ denote the jth autocovariance of 1-year growth. The 
variance of 1-year growth is $c_{yy}(0)$, while the variance of k-year growth 
is 

$$ \mathrm{var}[y(t) - y(t-k)] = k\,c_{yy}(0) + 2\sum_{j=1}^{k-1}(k-j)\,c_{yy}(j), \tag{1} $$

where y(t) is the natural logarithm of output. 
Suppose that over the long run the economy tends to damp disturbances, 
so that long-term growth is stable relative to short-term 
growth. In this case, a positive shock is likely to be followed by a 
period of lower than average growth, which partially offsets the initial 
disturbance. This implies that the second term on the right-hand side 
of equation (1) is negative, and thus the variance ratio is less than one. 
If over the long run the economy tends to amplify disturbances, so 
that long-term growth is unstable relative to short-term growth, then 
a positive shock is likely to be followed by a period of higher than 
average growth. The second term in equation (1) is positive, and the 
variance ratio is greater than one. Thus the variance ratio provides a 
simple way to characterize the relative stability of long-term growth. 
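
The sign argument can be made explicit by dividing equation (1) by $k\,c_{yy}(0)$. The restatement below is a sketch rather than part of the original derivation; the symbols $V_k$ (the variance ratio at horizon k, before any small-sample correction) and $\rho_j = c_{yy}(j)/c_{yy}(0)$ (the jth autocorrelation of 1-year growth) are introduced here for illustration.

```latex
V_k \;=\; \frac{\mathrm{var}[y(t) - y(t-k)]}{k\,c_{yy}(0)}
    \;=\; 1 + 2\sum_{j=1}^{k-1}\left(1 - \frac{j}{k}\right)\rho_j .
```

Predominantly negative autocorrelations therefore push $V_k$ below one, and predominantly positive autocorrelations push it above one.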

One can also relate the variance ratio to models that divide output 
into permanent and transitory components (see, e.g., Beveridge and 
Nelson 1981; Harvey 1985; Watson 1986; Clark 1987). Let us make 
two more assumptions: (1) that the permanent component is a ran¬ 
dom walk with drift and (2) that the temporary component vanishes 
in the long run. The first assumption is motivated by the habit of 
thinking of the permanent component as a stochastic trend. The sec¬ 
ond assumption defines “temporary.” In this case, Cochrane showed 
that as k grows large the variance ratio converges to the fraction of the 
variance of output growth that is due to the random walk component. 
Thus the variance ratio also provides a convenient measure of the 
importance of the random walk component in output. 
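
A simple special case may help fix ideas. The example below is an illustration under added assumptions, not a model taken from the paper: the permanent component is a random walk with drift whose innovation has variance $\sigma_\varepsilon^2$, and the temporary component is white noise with variance $\sigma_u^2$, independent of the permanent innovation. Both symbols are introduced here.

```latex
\Delta y(t) = \mu + \varepsilon(t) + u(t) - u(t-1), \qquad
c_{yy}(0) = \sigma_\varepsilon^2 + 2\sigma_u^2, \qquad
c_{yy}(1) = -\sigma_u^2, \qquad
c_{yy}(j) = 0 \ \text{for } j \ge 2,

\mathrm{var}[y(t) - y(t-k)] = k\,\sigma_\varepsilon^2 + 2\sigma_u^2, \qquad
V_k = \frac{\sigma_\varepsilon^2 + 2\sigma_u^2/k}{\sigma_\varepsilon^2 + 2\sigma_u^2}
\;\longrightarrow\; \frac{\sigma_\varepsilon^2}{\sigma_\varepsilon^2 + 2\sigma_u^2}
\quad \text{as } k \to \infty .
```

The limit is the share of the variance of 1-year growth that is due to the random walk innovation, which is the interpretation given in the text.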

II. Approximating the Distribution of the 
Variance Ratio 

As a preliminary to making cross-country comparisons, it is necessary 
to reconsider the limiting distribution of the variance ratio statistic. 
Cochrane uses the following estimator for the variance ratio: 

$$ V^k = \frac{k^{-1}\,\mathrm{var}[y(t) - y(t-k)]}{\mathrm{var}[y(t) - y(t-1)]}\left(\frac{T}{T-k+1}\right). $$

The second term is a bias correction factor. Anderson (1971) shows 
that as T grows large, k grows large, and k/T grows small, this estimator 
has a limiting normal distribution: 

$$ V^k \sim N\!\left(V,\ \frac{4k}{3T}\,V^2\right). $$

The term T/k is the number of nonoverlapping k-year periods in 
the sample. The normal approximation will be accurate for samples 
that contain a large number of k-year periods. Unfortunately, our 
samples contain only a few. In Cochrane’s sample, there are 118 years 
of data, and the relevant values of k range from 15 to 30. Since T/k is 
small in the available samples, the limiting normal approximation is 
unlikely to provide a good approximation. 

A series of Monte Carlo experiments was conducted in order to 
evaluate the accuracy of the normal approximation. Three processes 
were simulated: a random walk with drift, for which the variance ratio 
is equal to one; an ARIMA(0, 1, 1) calibrated so that the variance ratio 
is 1.8; and another ARIMA(0, 1, 1) calibrated so that the variance 
ratio is 0.2. The statistic $V^k$ was calculated for k = 15, and the process 
was replicated 1,000 times. Table 1 shows the results of the three 
experiments. The Monte Carlo evidence indicates a high degree of 
skewness in the empirical distribution of $V^k$. In particular, the lower 
tail of the normal is much too fat. To state a dramatic example, in the 
three experiments, $V^k$ was never as many as two standard errors below 
the mean. Thus the lower tail of a normal confidence interval will 
include values of the variance ratio that the empirical distribution 
reveals to be unusually small. Since the lower critical values are too 
small, the normal approximation makes it too hard to reject small 
values of the variance ratio. 2 
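
An experiment of this kind can be sketched in a few lines of Python. The code below is a minimal illustration, not the author's program, and it rests on several assumptions: variances are computed as average squared deviations from the estimated drift, shocks are Gaussian, the sample has T = 115 growth observations, and the ARIMA(0, 1, 1) processes use moving-average coefficients of 0.5 and -0.5, which give long-run variance ratios of 1.8 and 0.2 through $(1+\theta)^2/(1+\theta^2)$. The function names are hypothetical.

```python
import math
import random
import statistics


def variance_ratio(y, k):
    """Time-domain variance ratio with the T/(T-k+1) bias correction.

    y holds log levels y(0), ..., y(T); T is the number of 1-year growth rates.
    Assumption: variances are mean squared deviations from the estimated drift.
    """
    T = len(y) - 1
    drift = (y[-1] - y[0]) / T  # estimated mean 1-year growth

    def mean_sq_dev(h):
        # average squared deviation of h-year growth from h times the drift
        devs = [(y[t] - y[t - h] - h * drift) ** 2 for t in range(h, T + 1)]
        return sum(devs) / len(devs)

    return mean_sq_dev(k) / (k * mean_sq_dev(1)) * T / (T - k + 1)


def simulate_levels(T, theta, rng, drift=0.02):
    """Log levels whose growth is drift + e(t) + theta * e(t-1), e(t) ~ N(0, 1)."""
    e_prev = rng.gauss(0.0, 1.0)
    y = [0.0]
    for _ in range(T):
        e = rng.gauss(0.0, 1.0)
        y.append(y[-1] + drift + e + theta * e_prev)
        e_prev = e
    return y


def monte_carlo(theta, T=115, k=15, reps=1000, seed=0):
    """Draws of the variance ratio across independent simulated samples."""
    rng = random.Random(seed)
    return [variance_ratio(simulate_levels(T, theta, rng), k) for _ in range(reps)]


def skewness(xs):
    """Sample skewness (third standardized moment)."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)


if __name__ == "__main__":
    T, k = 115, 15
    for theta in (0.0, 0.5, -0.5):  # random walk, then the two ARIMA(0, 1, 1) cases
        draws = monte_carlo(theta, T, k)
        mean = statistics.fmean(draws)
        se = math.sqrt(4 * k / (3 * T)) * mean  # standard error under the normal limit
        share_low = sum(d < mean - 2 * se for d in draws) / len(draws)
        print(f"theta={theta:+.1f}  mean={mean:.2f}  skewness={skewness(draws):.2f}  "
              f"share more than 2 s.e. below mean={share_low:.3f}")
```

Under a normal distribution roughly 2.3 percent of the draws would fall more than two standard errors below the mean, so the reported share gives a direct check on the lower tail.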

Fortunately, there is an asymptotically equivalent frequency domain 
estimator whose limiting distribution provides a better approximation 
for samples of this size. As k grows large, the numerator of the 
variance ratio converges to 2π times the spectrum of output growth 
evaluated at frequency zero. Thus for large k we can write the variance 
ratio as 

$$ V = \frac{2\pi f_{yy}(0)}{c_{yy}(0)}, $$

where $f_{yy}(\omega)$ represents the spectrum evaluated at frequency ω. A 
standard frequency domain estimator for the numerator is obtained 
by smoothing the first few periodogram ordinates: 

$$ 2\pi f^T_{yy}(0) = 2\pi \sum_{j=1}^{m} W_T(\omega_j)\, I^T_{yy}(\omega_j), $$

$$ \omega_j = \frac{2\pi j}{T}, \qquad j = 1, \ldots, T-1, $$

$$ I^T_{yy}(\omega_j) = (2\pi T)^{-1}\,\bigl|d^T_y(\omega_j)\bigr|^2, $$

$$ d^T_y(\omega_j) = \sum_{t=1}^{T-1}\,[\Delta y(t) - \hat\mu]\exp(-i\omega_j t), $$


2 See Kim, Nelson, and Startz (1988) and Lo and MacKinlay (1989) for more extensive 
Monte Carlo evidence. In all cases in which T/k is small, Monte Carlo experiments 
reveal a dramatic departure from normality in the lower tail of the empirical distribution. 
Faust (1988) finds the small sample distribution for $V^k$ for the case in which the 
innovation is an identically and independently distributed normal random variable. 
Using numerical integration, he confirms that the lower tail of the normal is too fat. 
TABLE 1 

Evaluating the Limiting Distributions of V^k and V^f 

                      Percentiles of the Monte Carlo Distribution 

         .005          .025    .05           .5      .95     .975    .995 

A. Random Walk with Drift; V = 1 

V^k      .047          .073    .097          .478    .983    .993    1.00 
V^f      .008          .039    .075          .516    .929    .975    .994 

B. ARIMA(0, 1, 1) with Drift; V = .2 

V^k      .131          .185    .245          .709    .996    .999    1.00 
V^f      .018          .039    .066          .575    .967    .985    .997 

C. ARIMA(0, 1, 1) with Drift; V = 1.8 

V^k      [illegible]   .067    .084          .399    .919    .966    .996 
V^f      [illegible]   .034    [illegible]   .512    .915    .942    .978 

Note: The entries show the theoretical cumulative probability of the observed statistic at various percentiles of 
the empirical distribution. For V^k, the hypothesized distribution is $N(V, (4k/3T)V^2)$. For V^f, the hypothesized 
distribution is $V\chi^2_v/v$. The theoretical cumulative probability ought to match the percentiles of the empirical distribution. 


where $I^T_{yy}(\omega_j)$ denotes the periodogram, $d^T_y(\omega_j)$ is the discrete Fourier 
transform, and $\hat\mu$ is an estimate of the mean of Δy(t). The term $W_T(\omega_j)$ 
is a weight function that is concentrated near frequency zero, truncates 
at ω = 2πm/T, and is normalized so that it sums to one: 

$$ W_T(\omega_j) = \begin{cases} \dfrac{W^B_k(\omega_j)}{\sum_{i=1}^{m} W^B_k(\omega_i)} & \text{for } j = 1, \ldots, m, \\[2ex] 0 & \text{for } j > m, \end{cases} $$

$$ W^B_k(\omega_j) = (2\pi k)^{-1}\left[\frac{\sin(k\omega_j/2)}{\sin(\omega_j/2)}\right]^2, $$

where $W^B_k(\omega_j)$ is the Bartlett window. For k equal to 15 or 20, setting m 
equal to 10 makes $W_T(\omega_j)$ a good approximation to the Bartlett window. 
Choose a consistent estimator for $c_{yy}(0)$. One can estimate the 
variance ratio by 

$$ V^f = \frac{2\pi f^T_{yy}(0)}{c^T_{yy}(0)}. $$


The statistics $V^k$ and $V^f$ differ in small samples but are asymptotically 
equivalent. The asymptotic properties of $V^f$ depend on the behavior 
of $W_T(\omega_j)$ as T grows large. If the weights approach a Dirac 
delta function as T grows large, then $V^f$ has a limiting normal distribution. 
However, given the skewness of the Monte Carlo distribution, 
the normal approximation seems inappropriate. If the weights are 
assumed to be constant as T grows large, then $V^f$ is proportional to a 
weighted sum of independent chi-square variates in large samples. 3 
Since this is a bit cumbersome to work with, one can approximate it by 
a multiple of a chi-square variate: 4 

$$ V^f \sim V\,\frac{\chi^2_v}{v}, $$

where $v = 2 \big/ \sum_{j=1}^{m} W_T(\omega_j)^2$. 

A second series of Monte Carlo experiments was conducted to evaluate 
the accuracy of the chi-square approximation to the distribution 
of $V^f$. Table 1 shows that the chi-square is a considerable improvement 
over the normal. In particular, the chi-square approximation 
avoids making gross errors in the lower tail and has about the right 
degree of skewness. Hence, in making cross-country comparisons of 
estimates of the variance ratio, I employ the frequency domain estimator 
and use the chi-square approximation to construct confidence 
intervals. 5 
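
The frequency domain estimator and its chi-square interval can be sketched as follows. This is an illustrative implementation under stated assumptions, not the author's code: the Fourier sum runs over all T growth observations, the variance of 1-year growth is the sample variance with divisor T, and chi-square quantiles are obtained from the Wilson-Hilferty approximation (a substitute chosen here so that only the Python standard library is needed). All function names are hypothetical.

```python
import cmath
import math
import random
from statistics import NormalDist


def bartlett_weights(T, k, m):
    """Weights W_T(w_j), j = 1..m: proportional to the Bartlett window, summing to one."""
    raw = []
    for j in range(1, m + 1):
        w = 2 * math.pi * j / T
        raw.append((math.sin(k * w / 2) / math.sin(w / 2)) ** 2 / (2 * math.pi * k))
    total = sum(raw)
    return [r / total for r in raw]


def frequency_domain_ratio(growth, k, m):
    """Return (V^f, v): smoothed-periodogram variance ratio and chi-square d.o.f."""
    T = len(growth)
    mu = sum(growth) / T
    dev = [g - mu for g in growth]
    c0 = sum(d * d for d in dev) / T  # sample variance of 1-year growth
    weights = bartlett_weights(T, k, m)
    f0 = 0.0
    for j, wt in enumerate(weights, start=1):
        w = 2 * math.pi * j / T
        d = sum(x * cmath.exp(-1j * w * t) for t, x in enumerate(dev, start=1))
        f0 += wt * abs(d) ** 2 / (2 * math.pi * T)  # weighted periodogram ordinate
    v = 2.0 / sum(wt * wt for wt in weights)
    return 2 * math.pi * f0 / c0, v


def chi2_quantile_wh(p, v):
    """Wilson-Hilferty approximation to the p-quantile of a chi-square with v d.o.f."""
    z = NormalDist().inv_cdf(p)
    a = 2.0 / (9.0 * v)
    return v * (1.0 - a + z * math.sqrt(a)) ** 3


def confidence_interval(vf, v, gamma=0.90):
    """Approximate 100*gamma percent interval for V based on V^f ~ V * chi2_v / v."""
    lower = vf * v / chi2_quantile_wh((1 + gamma) / 2, v)
    upper = vf * v / chi2_quantile_wh((1 - gamma) / 2, v)
    return lower, upper


if __name__ == "__main__":
    rng = random.Random(0)
    growth = [0.02 + rng.gauss(0.0, 0.05) for _ in range(115)]  # simulated random-walk growth
    vf, v = frequency_domain_ratio(growth, k=15, m=10)
    lo, hi = confidence_interval(vf, v)
    print(f"V^f = {vf:.2f}, v = {v:.2f}, 90% interval = ({lo:.2f}, {hi:.2f})")
```

With T = 115, k = 15, and m = 10, these weights give v of roughly 9, so the 90 percent bounds are roughly 0.5 and 2.7 times the point estimate. The interval is strongly asymmetric, which is the feature the normal approximation misses.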


III. International Comparisons 

The data set includes annual real per capita GDP growth, 1871-1985, 
for Australia, Canada, Denmark, France, Italy, Norway, Sweden, and 
the United Kingdom as well as GDP and GNP growth for the United 
States. 6 The data for GDP through 1979 are taken from Maddison 
(1982), and updates through 1985 are taken from various issues of 
the OECD Main Economic Indicators. 7 The data for U.S. per capita 
GNP are derived from GNP data in Gordon (1986) and population 
data in Maddison. The natural logarithm of per capita output for 
each country is shown in figure 1. The long time span of the data set is 
a distinct advantage over studies that rely on post-World War II 
quarterly data. The more nonoverlapping long periods in the data, 
the more precisely one can measure the variance ratio. 8 

3 This follows from the fact that $f^T_{yy}(0) \xrightarrow{d} f_{yy}(0)\sum_{j=1}^{m} W_T(\omega_j)\,\chi^2_2(j)/2$ (Brillinger 1981, 
theorem 5.5.3) and $c^T_{yy}(0) \xrightarrow{p} c_{yy}(0)$. 

4 An approximate 100γ percent confidence interval for V is given by 
$(\chi^2_v[(1+\gamma)/2]/v)^{-1}\,V^f \le V \le (\chi^2_v[(1-\gamma)/2]/v)^{-1}\,V^f$. Note that although the mean and variance of 
$V^f$ depend on the unknown true value of V, the upper and lower bounds of the 
confidence interval do not. 

5 Since $V^k$ and $V^f$ are asymptotically equivalent, one could use the chi-square to 
approximate the small sample distribution of $V^k$. However, it turns out that in samples 
of this size the chi-square is a much better approximation to the distribution of $V^f$ than it is 
to the distribution of $V^k$. 

6 Data are also available for Germany, but I chose to omit it from the sample because 
of drastic changes in its borders. 

7 Maddison’s app. A contains a detailed description of his data sources. 

8 In Campbell and Mankiw (1988), there is only one nonoverlapping period of 20 
years or more. In Clark (1989), there are only two. 
TABLE 2 

Estimates of the Variance Ratio: Per Capita Output Growth, 1871-1985 

                            V^f                          V^k 
                    k = 15        k = 20         k = 15     k = 20 

Australia           1.15          1.21           1.25       1.40 
                    (.63, 3.2)    (.64, 4.1) 
Canada              .64           .64            .72        .77 
                    (.35, 1.8)    (.34, 2.2) 
Denmark             .92           .97            1.00       1.09 
                    (.51, 2.0)    (.51, 3.3) 
France              1.57          1.55           1.78       1.84 
                    (.86, 4.4)    (.82, 4.9) 
Italy               1.60          1.80           1.75       2.02 
                    (.88, 4.5)    (.96, 6.1) 
Norway              1.21          1.39           1.24       1.39 
                    (.67, 3.4)    (.74, 4.7) 
Sweden              .90           .89            .99        .97 
                    (.50, 2.5)    (.47, 3.0) 
United Kingdom      .77           .85            .94        1.03 
                    (.43, 2.2)    (.45, 2.9) 
United States: 
  GDP               .48           .36            .62        .51 
                    (.27, 1.4)    (.19, 1.2) 
  GNP               .49           .41            .60        .53 
                    (.27, 1.4)    (.22, 1.4) 

Note: Approximate 90 percent confidence intervals are shown in parentheses. 


Table 2 shows estimates of the variance ratio for each country. The 
first two columns show $V^f$ for k equal to 15 and 20 and m equal to 10. 
A 90 percent confidence interval based on the chi-square approximation 
is shown in parentheses. The last two columns show $V^k$ for k equal 
to 15 and 20. The results are not sensitive to either the choice of k or 
the choice of estimator. 

Table 2 reveals four interesting facts. First, since the estimates of 
U.S. GNP and GDP are essentially the same, comparisons of the variance 
ratio based on foreign GDP with Cochrane’s estimate based on 
U.S. GNP are not unreasonable. For U.S. GNP, the estimates are a bit 
larger than Cochrane’s. The data used here are taken from Gordon 
and differ a bit from Cochrane’s data. 9 Since the confidence intervals 
include Cochrane’s estimate, any differences that are due to choice of 
data series are not significant. 

Second, the United States has the smallest point estimate in the 

9 Friedman and Schwartz (1982) link their early data to net national product in 1947. 
Cochrane links the early data to GNP. Gordon adds a capital consumption adjustment 
to the early Friedman and Schwartz data and links them to post-World War II GNP. If 
the early Friedman and Schwartz data are more like NNP than GNP, then Gordon’s 
data are likely to be more homogeneous over time than Cochrane’s. 
sample. Aside from the United States, the smallest estimate is for 
Canada, whose variance ratio is 0.64. This is almost double Coch¬ 
rane's estimate for the United States. Four countries have point esti¬ 
mates that are greater than one, and two others have estimates in the 
neighborhood of 0.9. For France and Italy, the estimates are roughly 
1.6, which indicates that the variance of output growth is dominated 
by long-run movements. 

Third, even when one uses the more accurate chi-square approxi¬ 
mation, the confidence intervals remain large. In particular, the up¬ 
per confidence bounds are all quite large. 

Fourth, despite the imprecision of the estimates, one can find inter¬ 
esting lower probability bounds for many of the countries. The prob¬ 
ability that the true value of the variance ratio is smaller than the 
lower bound of the confidence interval is 5 percent. Apart from the 
United States and Canada, the lower probability bounds for the vari¬ 
ance ratio are all larger than Cochrane’s point estimate of one-third. 
The 5 percent lower probability bounds for Sweden and Denmark are 
in the neighborhood of one-half. For Norway and Australia, the 
lower bounds are roughly two-thirds, while for France and Italy the 
lower bounds are approximately 0.9. Even a reading of the evidence 
that favors long-run stability indicates that many other countries have 
variance ratios that are considerably larger than that of the United 
States. 

There are at least two ways to account for the differences in the 
variance ratios. It might be the case that countries that have high 
variance ratios have more variable long-term growth than the United 
States but roughly equally variable business cycles. Alternatively, 
their short-term cycles might be considerably smoother, while their 
long-term variability is about the same as that of the United States. 
As a first measure of the contribution of these two factors, one can 
compare across countries the variance of k-year growth and the 
variance of 1-year growth. 

The first two columns of table 3 report estimates of the numerator 
and denominator of the variance ratio. Most of the countries with 
large variance ratios have more variable long-term growth than the 
United States. For Italy and France, long-term growth is six times 
more variable than for U.S. GDP. In a number of other countries, the 
variability of long-term growth is two to two and one-half times larger 
than in the United States. Thus greater variability in long-term 
growth clearly plays an important role. 
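
The comparison in this paragraph can be sketched numerically. The snippet below is only an illustration: it assumes that the first numeric column of table 3 is the variance of 1-year growth, that the second is the long-run (k-year) variance, and that the variance ratio is their quotient. The dictionary `TABLE_3` and the function `variance_ratio` are hypothetical names, and only a few rows are copied in.

```python
# Values are 100 x variance, copied from a few rows of table 3.
# Assumption: tuple = (1-year variance, long-run variance, lower bound of long-run variance).
TABLE_3 = {
    "Canada": (0.40, 0.26, 0.14),
    "France": (0.39, 0.60, 0.32),
    "Sweden": (0.084, 0.075, 0.04),
    "United States (GDP)": (0.28, 0.10, 0.05),
}


def variance_ratio(var_one_year, var_long_run):
    """Ratio of long-run variance to 1-year variance."""
    return var_long_run / var_one_year


# Point estimates of the variance ratio for each country.
ratios = {c: variance_ratio(v[0], v[1]) for c, v in TABLE_3.items()}

# Lower probability bounds, using the lower end of the long-run interval.
lower_bounds = {c: variance_ratio(v[0], v[2]) for c, v in TABLE_3.items()}

# Long-run variance relative to U.S. GDP (the "six times" comparison for France).
us_long_run = TABLE_3["United States (GDP)"][1]
relative_long_run = {c: v[1] / us_long_run for c, v in TABLE_3.items()}
```

Under these assumptions the quotient for Canada is 0.65, close to the 0.64 quoted in the text, which is some reassurance that the columns have been read in the intended order.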

Sweden and the United Kingdom are exceptions. Their long-term 
growth is actually less variable than that of the United States. The 
high variance ratios for Sweden and the United Kingdom are due 
principally to a smaller variance of yearly output growth. While 
long-term growth is roughly 20-25 percent less variable, 1-year growth is 
TABLE 3 

Variance of Components of Growth, 1871-1985 

                                                $\sigma_b^2$ (Years per Cycle)
                  $\sigma_1^2$   $\sigma_k^2$       2-5             5-10            10+

Australia         .20     .24             .115            .042            .043
                          (.13, .81)      (.089, .154)    (.027, .075)    (.029, .068)
Canada            .40     .26             .216            .097            .087
                          (.14, .88)      (.168, .289)    (.063, .173)    (.058, .149)
Denmark           .16     .155            .113            .021            .026
                          (.08, .53)      (.088, .151)    (.014, .037)    (.017, .044)
France            .39     .60             .164            .069            .157
                          (.32, 2.0)      (.127, .220)    (.045, .123)    (.104, .268)
Italy             .36     .65             .158            .091            .111
                          (.34, 2.2)      (.123, .212)    (.059, .162)    (.074, .190)
Norway            .14     .19             .087            .023            .030
                          (.10, .64)      (.068, .117)    (.015, .036)    (.02, .051)
Sweden            .084    .075            .050            .013            .021
                          (.04, .25)      (.039, .067)    (.008, .023)    (.014, .036)
United Kingdom    .096    .082            .039            .032            .025
                          (.04, .28)      (.030, .052)    (.021, .057)    (.017, .043)
United States:
  GDP             .28     .10             .120            .097            .064
                          (.05, .34)      (.093, .161)    (.063, .173)    (.042, .109)
  GNP             .34     .14             .142            .124            .074
                          (.07, .47)      (.110, .190)    (.080, .221)    (.049, .127)

Note. $\sigma_1^2$ is the variance of first differences of log output, $\sigma_k^2$ is the 
variance of $k$ differences of log output, and $\sigma_b^2$ is the variance of band 
filtered components of output growth. Approximate 90 percent confidence 
intervals are shown in parentheses. The entries in the table are equal 
to $100 \times \sigma_j^2$. 



approximately 67 percent less variable. It must be the case that 
Sweden and the United Kingdom have considerably smoother dynamics 
at high to medium frequencies. In fact, among the six countries in 
which long-term growth is more variable than in the United States, 
three have less variable annual output growth. This implies that they 
also have less variable high- to medium-frequency components than 
the United States. 

To clarify the contribution of high- and medium-frequency dynamics 
to the cross-country differences in the variance ratio, one can 
compare the variance of output growth on given frequency bands. 
This can be measured by the following statistic: 



where 

$$d(\omega) = \begin{cases} 1 & \text{for } \omega \in [\omega_1, \omega_2] \\ 1 & \text{for } \omega \in [2\pi - \omega_2, 2\pi - \omega_1] \\ 0 & \text{otherwise.} \end{cases}$$
The term $d(\omega)$ is the gain of a band pass filter, and $(\omega_1, \omega_2)$ defines 
its boundaries. To measure the variance of growth cycles whose periods 
are between $j$ and $k$ years per cycle, set $\omega_2 = 2\pi/j$ radians and 
$\omega_1 = 2\pi/k$ radians. This measure is equivalent to filtering the output 
growth series in order to remove components at all frequencies lower 
than $\omega_1$ and higher than $\omega_2$ and then taking the variance of the 
filtered series. 

The statistic $\sigma_b^2$ is estimated by a weighted sum of periodogram 
ordinates: 

$$\hat\sigma_b^2 = \sum_j d(\omega_j) I(\omega_j).$$

Since $\hat\sigma_b^2$ is proportional to the sum of $n$ asymptotically independent 
chi-square variates with two degrees of freedom, its distribution can 
be approximated by $\hat\sigma_b^2 \sim \sigma_b^2 \chi_v^2 / v$, where $v = \sum_{j=1}^{T-1} d(\omega_j)$. 
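
A minimal sketch of this estimator, under stated assumptions, may help fix ideas. The function name `band_variance` and the periodogram normalization $|\sum_t x_t e^{-i\omega_j t}|^2 / T^2$ are choices made here for illustration (the normalization is picked so that the ordinates sum to the sample variance); the band edges follow the text, with periods between `min_period` and `max_period` years mapped to frequencies $2\pi/\text{max\_period}$ and $2\pi/\text{min\_period}$.

```python
import cmath
import math


def band_variance(growth, min_period, max_period):
    """Variance of `growth` due to cycles with period in [min_period, max_period].

    Sums periodogram ordinates over the band and its mirror image, i.e. it
    applies a 0/1 weight d(w_j) to each Fourier frequency w_j = 2*pi*j/T.
    """
    T = len(growth)
    mean = sum(growth) / T
    x = [g - mean for g in growth]           # remove the sample mean
    w_lo = 2 * math.pi / max_period          # lowest frequency kept
    w_hi = 2 * math.pi / min_period          # highest frequency kept
    total = 0.0
    for j in range(1, T):                    # j = 0 (the mean) is excluded
        w = 2 * math.pi * j / T
        in_band = (w_lo <= w <= w_hi) or (
            2 * math.pi - w_hi <= w <= 2 * math.pi - w_lo
        )
        if in_band:
            dft = sum(x[t] * cmath.exp(-1j * w * t) for t in range(T))
            total += abs(dft) ** 2 / T ** 2  # illustrative normalization
    return total


# Example: a pure 8-year cycle with unit amplitude has variance 0.5,
# all of which should fall in the 5-10 year band.
series = [math.cos(2 * math.pi * t / 8) for t in range(64)]
medium = band_variance(series, 5, 10)
short = band_variance(series, 2, 5)
```

Passing `math.inf` as `max_period` gives the "10 or more years" band, since the lower frequency edge then becomes zero while the zero frequency itself stays excluded.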

To compare the variability of output growth on various frequency 
bands, I divided the frequency domain into three intervals: 2-5 years 
per cycle, 5-10 years per cycle, and 10 or more years per cycle. The 
last three columns of table 3 report the estimates of $\sigma_b^2$ for each 
country and frequency band. Approximate 90 percent confidence 
intervals are shown in parentheses. 

For Italy and France, the variance of cycles that last 10 years or 
longer is roughly twice as large as that for the United States. The 
variance of cycles lasting 10 years or less is nearly the same in all three 
countries. This confirms that the large variance ratios for France and 
Italy are due principally to greater variability in low-frequency 
components of growth. 

Australia, Denmark, Norway, Sweden, and the United Kingdom 
have smoother dynamics than the United States on all three 
frequency bands. The greatest difference tends to occur at medium 
frequencies. Fluctuations lasting 5-10 years are considerably 
smoother than those in the United States. At these frequencies, the 
variances are roughly 60-85 percent smaller than that of the United 
States. Thus for these countries, smoother dynamics at frequencies 
traditionally associated with business cycles contribute to the large 
variance ratios. 

The data do not support a unique accounting for high variance 
ratios in all countries. In some, a high variance ratio is due mainly to 
greater variability at low frequencies. In others, smoother 
medium-frequency components seem to account for most of the difference. 
One fact that emerges clearly is that the United States is the only 
country that combines relatively stable long growth cycles with 
variable short growth cycles. 
IV. Comovements of Long Growth Cycles 

The previous section considered univariate evidence on the relative 
stability of long-run growth. This section examines the covariation 
among long growth cycles. The object of this exercise is to see 
whether low-frequency growth cycles are similar across countries or 
are idiosyncratic. 

Let us first consider whether low-frequency components of growth 
are more highly correlated across countries than year-to-year changes 
in output. For example, if long-term growth is idiosyncratic but 
business cycles have a common international component, then one might 
expect annual growth to be more highly correlated across countries 
than long growth cycles. On the other hand, if the prospects for 
long-term growth are common to many countries while short cycles are 
largely country specific, then one might expect low-frequency 
components of growth to be more highly correlated than annual growth. 

This can be addressed by comparing the correlation matrix of output 
growth with the coherency matrix evaluated at frequency zero. 
Coherency is the frequency domain analogue to correlation, and it is 
defined as 

$$c_{ij}(\omega) = f_{ij}(\omega)[f_{ii}(\omega) f_{jj}(\omega)]^{-1/2},$$

where $f_{ij}(\omega)$ is the $ij$th element of the spectral density matrix evaluated 
at frequency $\omega$. The diagonal elements of the spectral density matrix 
are the power spectra, and the off-diagonal elements are the cross 
spectra. The cross spectra at frequency zero are proportional to the 
covariance between long growth cycles in each pair of countries. 
Alternatively, in models that divide output into random walk and 
stationary components, the cross spectra at zero are proportional to the 
covariance between the random walk innovations (see Cochrane and 
Sbordone 1988; Phillips and Ouliaris 1988). 
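
The coherency definition is the same rescaling that turns a covariance matrix into a correlation matrix. As a small illustration, assuming a real-valued spectral density matrix at frequency zero is already available as a list of lists (the function name `coherency_matrix` and the example matrix are hypothetical):

```python
import math


def coherency_matrix(spectral_density):
    """Rescale a real spectral density matrix so that entry (i, j)
    becomes f_ij / sqrt(f_ii * f_jj)."""
    n = len(spectral_density)
    return [
        [
            spectral_density[i][j]
            / math.sqrt(spectral_density[i][i] * spectral_density[j][j])
            for j in range(n)
        ]
        for i in range(n)
    ]


# Made-up 2 x 2 example: power spectra 4 and 1, cross spectrum 1.
example = [[4.0, 1.0], [1.0, 1.0]]
coherency = coherency_matrix(example)
```

The diagonal of the result is one by construction, so only the off-diagonal entries carry information, just as in a correlation matrix.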

The spectral density matrix at frequency zero is estimated by 
smoothing the real parts of the periodogram ordinates in the 
neighborhood of frequency zero: 10 

$$\hat f_{yy}(0) = m^{-1} \sum_{j=1}^{m} \mathrm{Re}(I_{yy}(\omega_j)),$$

where 

$$I_{yy}(\omega_j) = (2\pi T)^{-1} d_y(\omega_j)\, d_y(\omega_j)'$$


10 These are just multivariate extensions of the statistics introduced in the previous 
sections. Phillips and Ouliaris (1988) indicate that a rectangular window is appropriate 
in the presence of cointegration. Tables 4 and 5 are estimated for m = 5. 
and 

$$d_y(\omega_j) = \sum_{t=1}^{T-1} [\Delta y(t) - \mu] \exp(-i\omega_j t).$$

Table 4 presents estimates of the correlation matrix for output 
growth as well as the coherency matrix at frequency zero. The values 
above the diagonal are correlations, and the values below the diagonal 
are coherencies. In most cases, the coherency at zero is greater than 
the correlation. The difference is particularly large for the 
continental European countries. Their average correlation is approximately 
one-third, while their average coherency is roughly two-thirds. 

A test for cointegration provides more formal evidence on the 
degree to which long growth cycles are related across countries. If there 
are long-run restrictions across countries that prevent output levels 
from diverging by too much, then the series will be cointegrated. For 
example, some theories of long-term growth predict that per capita 
output levels will converge. If one country were to temporarily grow 
more rapidly than the rest, the others would eventually catch up. 

Phillips and Ouliaris (1988) present a test for cointegration that is 
based on the rank of the spectral density matrix at zero. They show 
that if $y(t)$ has $n$ elements but only $p$ independent unit roots, with 
$p < n$, $f_{yy}(0)$ is an $n \times n$ matrix with rank $p$. Thus if $y(t)$ is 
cointegrated, the estimated spectral density matrix should have $p$ 
nonzero principal components. They suggest the following procedure. 
Construct an upper probability bound for the ratio of the sum of the 
$n - p$ smallest eigenvalues of $f_{yy}(0)$ to the sum of all the eigenvalues. 
If the upper bound is sufficiently small, one can infer that the $n - p$ 
smallest roots are negligible. Small upper bounds can thus be 
interpreted as evidence for cointegration. 

Table 5 shows the estimates of the eigenvalues of $f_{yy}(0)$ as well as the 
Phillips-Ouliaris bounds test. The estimate of the last eigenvalue is 
negative but not significantly different from zero. The first three 
principal components account for 84 percent of the total variation, 
and the first five account for 94 percent. In a traditional application of 
principal components analysis, one would probably choose to retain 
three to five components. The 10 percent upper probability bound 
for the last three principal components is 5.8 percent of the total 
variation, while the upper bound for the last two is 1.7 percent. This 
seems to be small enough to accept the hypothesis of cointegration. 
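
The point estimates behind this test are simple eigenvalue shares, which the following sketch recomputes from the nine eigenvalues reported in table 5. It is only an illustration: the names `EIGENVALUES` and `tail_share` are hypothetical, and the confidence bounds themselves are not reproduced here because their construction is not spelled out in this section.

```python
# Eigenvalues of the estimated spectral density matrix at frequency zero,
# in descending order, as reported in table 5.
EIGENVALUES = [
    0.179e-02, 0.955e-03, 0.295e-03, 0.203e-03, 0.156e-03,
    0.102e-03, 0.955e-04, 0.485e-04, -0.293e-04,
]


def tail_share(eigenvalues, p):
    """Share of the trace due to the n - p smallest eigenvalues
    (the statistic B in the note to table 5)."""
    return sum(eigenvalues[p:]) / sum(eigenvalues)


# Cumulative share explained by the first three principal components.
first_three = 1.0 - tail_share(EIGENVALUES, 3)

# B for each p from 2 to 8, matching the rows of the bounds test.
b_values = {p: tail_share(EIGENVALUES, p) for p in range(2, 9)}
```

With these inputs the first three components account for about 84 percent of the trace, and the shares for the last three and last two components are about 3.2 and 0.5 percent, in line with the B column of the table.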

The multivariate evidence indicates that long growth cycles are 
closely related across countries. Low-frequency components of 
growth are more highly correlated across countries than year-to-year 
changes in output. Further, levels of per capita output in the various 
countries are cointegrated, which implies that long-run dynamics 
prevent output levels from diverging by too much. 
TABLE 5 

Principal Components Analysis of Long Growth Cycles 

Eigenvalues      Percentage of tr[$f_{yy}(0)$]      Cumulative Percentage

 .179E-02          .498                           .498
 .955E-03          .260                           .758
 .295E-03          .082                           .840
 .203E-03          .057                           .896
 .156E-03          .043                           .940
 .102E-03          .028                           .968
 .955E-04          .027                           .995
 .485E-04          .013                          1.008
-.293E-04         -.008                          1.000

Phillips-Ouliaris Bounds Test for Cointegration* 

p       B        90% Confidence Interval

2       .24      (.12, .36)
3       .16      (.075, .24)
4       .10      (.044, .16)
5       .06      (.022, .10)
6       .032     (.006, .058)
7       .005     (-.006, .017)
8      -.008     (-.015, -.001)

Note. $f_{yy}(0)$ is the spectral density matrix at frequency zero for output 
growth in the various countries. 

* Define $B = \sum_{i=p+1}^{n} \lambda_i / \sum_{i=1}^{n} \lambda_i$, where $\lambda_i$ is the $i$th eigenvalue of $f_{yy}(0)$. 


V. Summary 

This paper reconsiders Cochrane’s evidence on the relative stability of 
long-term growth. It estimates variance ratios for nine OECD 
countries for the period 1871-1985. For the United States, Cochrane 
found that long-run or low-frequency components of growth are 
stable relative to year-to-year changes in output. The results of this 
paper indicate that the relative stability of long-run growth is unique 
to the U.S. data. For countries other than the United States, the point 
estimates for the variance ratio are all considerably larger than 
Cochrane’s estimate of one-third. In fact, almost all the lower probability 
bounds are larger than one-third. This evidence indicates considerably 
less tendency for long-run stability than found by Cochrane. 

Two factors account for the cross-country differences in the 
variance ratios. For some countries, low-frequency components of growth 
are considerably more variable than those for the United States. In 
others, high- to medium-frequency components are considerably 
smoother. The contribution of each factor varies across countries. 
The United States seems to be the only country that combines smooth 
dynamics at low frequencies and variable dynamics at high to medium 
frequencies. 

Finally, the paper examines the covariation among long growth 
cycles. Long cycles are more highly correlated across countries than 
year-to-year changes in output. In fact, levels of per capita output 
appear to be cointegrated, which implies that there are bounds on the 
degree to which output levels can diverge in the long run. 

Overall, these measures suggest that output fluctuations in the 
various countries are not all alike. For most countries, the evidence tends 
to support the view that a large fraction of the economy’s year-to-year 
movements leave long-term marks. It tends to contradict the view that 
long-run growth is stable relative to year-to-year changes in output. 
Habit Formation: A Resolution of the Equity 
Premium Puzzle 


George M. Constantinides 

University of Chicago and National Bureau of Economic Research 


The equity premium puzzle, identified by Mehra and Prescott, states 
that, for plausible values of the risk aversion coefficient, the 
difference of the expected rate of return on the stock market and the 
riskless rate of interest is too large, given the observed small variance 
of the growth rate in per capita consumption. The puzzle is resolved 
in the context of an economy with rational expectations once the 
time separability of von Neumann-Morgenstern preferences is 
relaxed to allow for adjacent complementarity in consumption, a 
property known as habit persistence. Essentially habit persistence drives a 
wedge between the relative risk aversion of the representative agent 
and the intertemporal elasticity of substitution in consumption. 


I. Introduction 

Rational expectations, a cornerstone of modern theories in economics 
and finance, has come under attack. Are prices too volatile relative to 
the information arriving in the market? Is the mean premium on 
equities over the riskless rate too large? Is the real interest rate too 
low? Is the market’s risk aversion too high? Is the intertemporal 
elasticity of substitution in consumption with respect to changes in the 
productivity of capital too low? Finally, is the time series of aggregate 
consumption of nondurables and services too smooth? 

Mehra and Prescott (1985) raised some of these questions in their 
equity premium puzzle. They employed a variant of Lucas’s (1978) 
pure exchange economy and conducted a “calibration” exercise in the 
spirit of Kydland and Prescott (1982). Mehra and Prescott chose the 
parameters of the endowment process to match the sample mean, 
variance, and first-order autocorrelation of the annual growth rate of 
per capita consumption in the years 1889-1978. They postulated that 
the representative agent has time- and state-separable utility. The 
puzzle is that they were unable to find a plausible pair of the 
subjective discount rate and relative risk aversion (RRA) of the 
representative agent to match the sample mean of the annual real rate of interest 
and of the equity premium over the same 90-year period. Stated 
differently, the consumption growth rate appears to be too smooth to 
justify the mean equity premium. 

The equity premium puzzle is not an isolated observation. Hansen 
and Singleton (1982, 1983), Ferson (1983), Grossman, Melino, and 
Shiller (1987), and several others rejected the Euler equation 
restriction on asset returns and the marginal rate of substitution implied by 
time- and state-separable preferences. Mankiw, Romer, and Shapiro 
(1985), Campbell and Shiller (1988), and West (1988a, 1988b) found 
that stock returns are too volatile if future dividends are discounted 
at a constant rate. Some of the empirical literature on the 
consumption function rejected the joint hypothesis of rationality and time- 
and state-separable preferences (see Deaton 1987; Hall 1989). Black 
(1986) and Roll (1988) questioned the rationality of price changes. 
Campbell and Kyle (1988) and De Long et al. (1990) made a case for 
noise traders. Finally, the stock market crash of October 1987 has 
added fuel to the debate. 

The goal of this paper is to show that the equity premium puzzle is 
resolved in a rational expectations model, once we relax the time 
separability of preferences and allow for adjacent complementarity in 
consumption, a property known as habit persistence. 

Marshall (1920) discussed the notion that tastes can be cultivated 
and that they are affected by past consumption. Duesenberry’s (1949) 
thesis on the consumption function is probably the first serious 
examination of the implications of habit persistence. Ryder and Heal (1973) 
introduced the notion of adjacent and distant complementarity and 
discussed the stability of a growth model in the presence of habit 
persistence. Stigler and Becker (1977) argued that preferences should 
not be taken as exogenous but that it is fruitful to endogenize them 
and search for factors that explain differences or changes in behavior. 
Kydland and Prescott (1982) introduced preferences that are 
non-time separable in leisure. Becker and Murphy (1988) presented a 
theory of rational addiction and provided an insightful discussion on 
the link between addiction and complementarity. Sundaresan (1989) 
discussed the volatility of consumption and wealth in the presence of 
habit persistence. 

In his examination of habit formation and dynamic demand 
functions, Pollak (1970, p. 761) insisted that “a fundamental assumption 
of the habit-formation model is that the individual does not take 
account of the effect of his current purchase on his future 
preferences and future consumption.” I see nothing fundamental in the 
association of habit formation with some form of myopia or 
irrationality. In the present paper, habit persistence is introduced in a model 
with rational expectations, given that the goal is to show that the 
equity premium puzzle does not lead to the conclusion that the 
rational expectations model is bankrupt. 

The paper is organized as follows. Habit persistence is embedded in 
a variant of the neoclassical growth model in Section II. Theorem 1 
proves existence and uniqueness of an optimal policy and presents 
the optimal policy, the derived utility of capital, and the dynamics of 
capital and consumption. Theorem 2 derives the stationary 
distribution of the state variable and enables one to calculate the 
unconditional mean and variance of the consumption growth rate. Section IIC 
illustrates that the key role of habit persistence is to drive a wedge 
between the RRA coefficient and the inverse of the intertemporal 
elasticity of substitution in consumption. In Section III, I interpret 
the growth model as the equilibrium in a representative-consumer 
production economy. I resolve the equity premium puzzle by showing 
in table 1 that habit persistence can generate the sample mean and 
variance of the consumption growth rate with low risk aversion. In 
Section IV, I discuss alternative potential explanations of the puzzle. 
Finally, in Section V, I review related empirical evidence and offer 
suggestions for future research. 


II. Habit Persistence in a Production Economy 

A. The Model and Assumptions 

Habit persistence is introduced in a variant of the neoclassical growth 
model. The optimal consumption and investment paths are 
interpreted as the equilibrium paths in a representative-consumer 
production economy, and the shadow prices of assets are interpreted as the 
equilibrium prices. 

There exists only one production good, which is also the 
consumption good. This good may be consumed or invested in two 
technologies. The technologies have constant returns to scale and rates of 
return over the period $[t, t + dt]$ equal to $r\,dt$ and $\mu\,dt + \sigma\,dw(t)$, 
respectively, where $r$, $\mu$, and $\sigma$ are constants and $w(t)$ is a standard 
Brownian motion in $R^1$. 

The infinitely lived representative consumer has capital $W(t)$ at time 
$t$ denominated in units of the consumption good, investing fraction 
$\alpha(t)$, $0 \le \alpha(t) \le 1$, of the capital in the risky technology and the 
remaining fraction $1 - \alpha(t)$ in the riskless technology. The consumer 
also consumes $c(t)dt$ in the period $[t, t + dt]$. Assume zero endowment 
flow and labor income. The increase in capital over $[t, t + dt]$ is 

$$dW(t) = \{[(\mu - r)\alpha(t) + r]W(t) - c(t)\}dt + \sigma\alpha(t)W(t)dw(t). \tag{1}$$

Given a consumption and investment policy, $\{c(t), \alpha(t), t \ge 0\}$, the 
expected utility of consumption from time 0 to infinity is defined as 

$$E_0 \int_0^\infty e^{-\rho t} \gamma^{-1}[c(t) - x(t)]^{\gamma} dt, \tag{2}$$

where 

$$x(t) \equiv e^{-at}x_0 + b\int_0^t e^{a(s - t)} c(s)ds. \tag{3}$$

Since $\lim_{\gamma \to 0} \gamma^{-1}(y^{\gamma} - 1) = \ln y$, the case $\gamma \to 0$ in equation 
(2) corresponds to logarithmic utility, which may be treated 
separately rather than cluttering the notation by replacing $(c - x)^{\gamma}$ with 
$(c - x)^{\gamma} - 1$. 

The special case $x_0 = b = 0$ corresponds to time-separable utility 
with constant RRA, $1 - \gamma$. The novel feature of the utility function 
studied in this paper is that the subsistence level of consumption, $x(t)$, 
is an exponentially weighted sum of past consumption. Thus utility is 
not time separable but exhibits habit persistence. The particular form 
of the habit-forming state variable, $x(t)$, defined in equation (3), was 
introduced by Ryder and Heal (1973), who studied a two-factor 
growth model maximizing expected utility $E_0 \int_0^\infty e^{-\rho t} u(c(t), x(t))dt$. 

The utility function defined in equations (2) and (3) exhibits 
adjacent complementarity in consumption; that is, an increase in 
consumption increases the marginal utility of consumption at adjacent 
dates relative to the marginal utility of consumption at distant ones. 
Formally, define $J'(c(\cdot), t_1)$ as the marginal utility of consumption at 
date $t_1$, where the derivative takes into account the impact of the 
change in $c(t_1)$ on all future values of $x(t)$. Define also the marginal 
rate of substitution between consumption at dates $t_1$ and $t_2$, $0 < t_1 < t_2$, 
as $J'(c(\cdot), t_1)/J'(c(\cdot), t_2)$. By specializing the results of Ryder and Heal 
(1973), one can show that along a constant consumption path, $c(t) = 
ax(t)/b = ax_0/b$, there exists a number $l$, $t_1 < l < t_2$, such that the 
marginal rate of substitution increases when consumption $c(t)$ 
increases, for $t < l$. 

Theoretical and empirical tractability is the only reason why I 
model habit formation as in equations (2) and (3). My goal is not to 
study the most general utility function that exhibits habit persistence 
but rather to employ the simplest utility specification that resolves the 
equity premium puzzle. 

The consumer’s choice of a consumption and investment policy is 
restricted to the set of admissible policies defined by the following 
four properties: (i) The consumption and investment decisions taken 
at date $t$ are based solely on information available at date $t$. (ii) The 
consumption rate is nonnegative ($c(t) \ge 0$), does not fall below the 
subsistence level ($c(t) \ge x(t)$), and results in finite total consumption 
over any horizon; that is, $\int_0^t c(s)ds < \infty$ for all $t$ almost surely. (iii) 
Investment in both technologies is nonnegative; that is, $0 \le \alpha(t) \le 1$ 
for all $t$ almost surely. (iv) The policy guarantees that the capital 
remains nonnegative; that is, $W(t) \ge 0$ for all $t$ almost surely. 

An optimal admissible policy and the associated derived utility of 
capital are defined by 

$$V(W_0, x_0) \equiv \max_{\substack{\text{admissible} \\ \alpha(t),\, c(t),\, t \ge 0}} E_0 \int_0^\infty e^{-\rho t} \gamma^{-1}[c(t) - x(t)]^{\gamma} dt, \tag{4}$$

where $W(0) = W_0$ and $x(0) = x_0$. I impose restrictions on the model 
parameters and motivate these restrictions. 

Assume that 

$$1 - \gamma > 0, \quad \gamma \ne 0. \tag{5}$$

The case $\gamma = 0$ corresponds to logarithmic utility and may be treated 
separately. Condition (5) is necessary if the RRA coefficient of the 
consumer is to be positive. As shown later, $1 - \gamma$ is only 
approximately equal to the RRA coefficient. The equality is exact if utility is 
time separable, $b = 0$, and is of the power form, $x_0 = 0$. 

Conditions (6)-(8), 

$$W_0 > 0, \tag{6}$$

$$W_0 - \frac{x_0}{r + a - b} > 0, \tag{7}$$

and 

$$0 < b < r + a, \tag{8}$$

jointly imply that the set of admissible policies is nonempty. In 
particular, the policy $\{c(t) = (r + a - b)W(t), \alpha(t) = 0, t \ge 0\}$ implies 

$$W(t) = W_0 \exp[(b - a)t] > 0,$$

$$c(t) = (r + a - b)W_0 \exp[(b - a)t] > 0,$$

$$\int_0^t c(s)ds < \infty,$$

$$c(t) - x(t) = (r + a - b)\left(W_0 - \frac{x_0}{r + a - b}\right)\exp(-at) > 0,$$

thereby satisfying all the conditions of an admissible policy. 
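
The last of these expressions can be checked numerically. The sketch below is an illustration under stated assumptions: it takes the habit stock to be $x(t) = e^{-at}x_0 + b\int_0^t e^{a(s-t)}c(s)ds$ as in equation (3), evaluates that integral with a midpoint rule for the riskless policy $c(t) = (r + a - b)W_0 e^{(b-a)t}$, and compares $c(t) - x(t)$ with the closed form. The function names and the parameter values are hypothetical and chosen only to satisfy conditions (6)-(8).

```python
import math


def habit_gap_numeric(W0, x0, r, a, b, t, steps=20000):
    """c(t) - x(t) under the riskless policy, with the integral in the
    habit stock evaluated by a midpoint rule."""
    k = r + a - b

    def consumption(s):
        return k * W0 * math.exp((b - a) * s)

    h = t / steps
    integral = h * sum(
        math.exp(a * ((i + 0.5) * h - t)) * consumption((i + 0.5) * h)
        for i in range(steps)
    )
    habit = math.exp(-a * t) * x0 + b * integral
    return consumption(t) - habit


def habit_gap_closed_form(W0, x0, r, a, b, t):
    """Closed-form expression for c(t) - x(t) under the same policy."""
    k = r + a - b
    return k * (W0 - x0 / k) * math.exp(-a * t)


# Hypothetical parameter values satisfying W0 > 0, W0 > x0/(r+a-b), 0 < b < r+a.
params = dict(W0=1.0, x0=0.02, r=0.01, a=0.5, b=0.4)
numeric = habit_gap_numeric(t=5.0, **params)
closed = habit_gap_closed_form(t=5.0, **params)
```

The two values agree to many digits, and both are positive, so consumption stays above the subsistence level along this path.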

The condition 

$$\rho - \gamma r - \frac{\gamma(\mu - r)^2}{2(1 - \gamma)\sigma^2} > 0 \tag{9}$$

ensures that, under the optimal policy, the expected utility of 
consumption flow grows at a rate that is lower than the time preference, 
so that the expected utility of consumption over the infinite horizon is 
finite. It also implies that the appropriate transversality condition is 
satisfied. 

Finally, assume that 

$$x_0 \ge 0 \tag{10}$$

and 

$$0 \le m \le 1, \quad m \equiv \frac{\mu - r}{(1 - \gamma)\sigma^2}. \tag{11}$$


Conditions (5)-(11) are invoked in theorem 1 to prove that an 
optimal policy exists and is unique, and they lead to closed-form 
expressions for the optimal policy and for the derived utility of capital. 
Essentially, condition (10) guarantees that the condition $c(t) \ge 0$ of an 
admissible policy is nonbinding, and condition (11) guarantees that 
the condition $0 \le \alpha(t) \le 1$ of an admissible policy is nonbinding. Then 
the optimal consumption and investment are at an interior maximum, 
and this simplification leads to closed-form expressions. 


B. Optimal Consumption and Investment Policy 

In the first theorem I prove existence and uniqueness of an optimal 
policy, state the optimal policy, state the derived utility of capital, and 
state the dynamics of capital and consumption. 
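Theorem 1. Under conditions (5)-(11), an optimal admissible 
consumption and investment policy exists, is unique, and is given by 

$$c^*(t) = x(t) + h\left[W(t) - \frac{x(t)}{r + a - b}\right] \tag{12}$$

and 

$$\alpha^*(t) = m\left[1 - \frac{x(t)/W(t)}{r + a - b}\right], \tag{13}$$

where 

$$h \equiv \frac{r + a - b}{(r + a)(1 - \gamma)}\left[\rho - \gamma r - \frac{\gamma(\mu - r)^2}{2(1 - \gamma)\sigma^2}\right] > 0. \tag{14}$$

The derived utility of capital is 

$$V(W_0, x_0) = \frac{(r + a - b)h^{\gamma - 1}}{(r + a)\gamma}\left[W_0 - \frac{x_0}{r + a - b}\right]^{\gamma}. \tag{15}$$

The capital is 

$$W(t) = \frac{x(t)}{r + a - b} + \left(W_0 - \frac{x_0}{r + a - b}\right)\exp\left[\left(n - \frac{m^2\sigma^2}{2}\right)t + m\sigma w(t)\right], \tag{16}$$

and the consumption growth rate is 

$$\frac{dc(t)}{c(t)} = \left[n + b - (n + a)\frac{x(t)}{c(t)}\right]dt + \left[1 - \frac{x(t)}{c(t)}\right]m\sigma\,dw(t), \tag{17}$$

where 

$$n \equiv \frac{r - \rho}{1 - \gamma} + \frac{(\mu - r)^2(2 - \gamma)}{2(1 - \gamma)^2\sigma^2}. \tag{18}$$
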

The theorem is proved in Appendix A. Merton (1971) considered 
the special case $a = b = 0$, which corresponds to time-separable utility 
with hyperbolic absolute risk aversion. He stated the optimal policy 
and proved its optimality and uniqueness. Sundaresan (1989) stated 
the optimal policy in two cases of nonseparable utility, $a = b$ and 
$a \ne b$, but with direct utility that is exponential in $c - x$. 
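
To make the constants in theorem 1 concrete, the following sketch evaluates them for one set of parameter values. It rests on assumptions that should be read as such: $m$ is taken to be $(\mu - r)/[(1 - \gamma)\sigma^2]$, $h$ is taken to be $(r + a - b)/[(r + a)(1 - \gamma)]$ times the left-hand side of condition (9), and $n$ is taken to be $(r - \rho)/(1 - \gamma) + (\mu - r)^2(2 - \gamma)/[2(1 - \gamma)^2\sigma^2]$. The function name and the parameter values are hypothetical and are not the values used in the paper's tables.

```python
def theorem1_constants(gamma, rho, r, mu, sigma, a, b):
    """Evaluate m, h and n under the functional forms assumed in the lead-in."""
    var = sigma ** 2
    m = (mu - r) / ((1 - gamma) * var)
    # Left-hand side of condition (9); must be positive.
    bracket = rho - gamma * r - gamma * (mu - r) ** 2 / (2 * (1 - gamma) * var)
    h = (r + a - b) / ((r + a) * (1 - gamma)) * bracket
    n = (r - rho) / (1 - gamma) + (mu - r) ** 2 * (2 - gamma) / (
        2 * (1 - gamma) ** 2 * var
    )
    return {"m": m, "h": h, "n": n, "bracket": bracket}


# Hypothetical parameter values for illustration only.
values = theorem1_constants(
    gamma=-1.0, rho=0.04, r=0.01, mu=0.07, sigma=0.2, a=0.5, b=0.4
)

# Consistency check: n should equal the drift of the surplus capital,
# r + m * (mu - r) minus the time-separable consumption-wealth ratio.
drift = 0.01 + values["m"] * (0.07 - 0.01) - values["bracket"] / (1 - (-1.0))
```

For these inputs $m = 0.75$, which lies in $[0, 1]$ as condition (11) requires, $h$ is positive, and $n$ coincides with the drift computed from the portfolio rule, which is the identity behind the surplus-capital dynamics in equation (16).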

In addressing the equity premium puzzle, I interpret the optimal 
paths specified by theorem 1 as the equilibrium paths in a 
representative-consumer economy. In particular, the consumption growth rate, 
specified by equation (17), is interpreted as the per capita 
consumption growth rate. The mean and variance of the consumption growth 
rate are functions of the state variable $x(t)$, which appears as the ratio 
$z(t) \equiv x(t)/c(t)$. Theorem 2 states conditions under which this ratio has 
a stationary distribution and presents this distribution. This 
distribution is used to calculate the unconditional mean and variance of the 
consumption growth rate. 

The RRA coefficient is defined in Section IIC and is shown to be a 
function of the state variable $x(t)$, which appears as the ratio $y(t) \equiv 
x(t)/[c(t) - x(t)] = z(t)/[1 - z(t)]$. Theorem 2 states conditions under 
which this ratio has a stationary distribution and presents this 
distribution and the mean of $y(t)$. This distribution is used to calculate the 
unconditional mean of the RRA coefficient. 

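Theorem 2. Assume that conditions (5)-(11) hold and also that 

$$n + a - b - m^2\sigma^2 > 0. \tag{19}$$

Then (i) $y(t) \equiv x(t)/[c(t) - x(t)]$ has a stationary probability distribution 
with density 

$$p_y(y) = \lambda\, y^{-2(n + a - b)/m^2\sigma^2} \exp\left(-\frac{2b}{m^2\sigma^2 y}\right), \quad 0 < y < \infty, \tag{20}$$

where 

$$\lambda^{-1} \equiv \left(\frac{2b}{m^2\sigma^2}\right)^{1 - 2(n + a - b)/m^2\sigma^2} \Gamma\left[\frac{2(n + a - b)}{m^2\sigma^2} - 1\right] \tag{21}$$

and $\Gamma(\cdot)$ is the gamma function. For the stationary distribution, $y$ has a 
single mode $\hat y$, 

$$\hat y = \frac{b}{n + a - b} < \infty, \tag{22}$$

and mean 

$$\bar y = \frac{b}{n + a - b - m^2\sigma^2} < \infty. \tag{23}$$

(ii) $z(t) \equiv x(t)/c(t)$ has a stationary probability distribution with density 

$$p_z(z) = \lambda e^{2b/m^2\sigma^2} (1 - z)^{2(n + a - b - m^2\sigma^2)/m^2\sigma^2} z^{-2(n + a - b)/m^2\sigma^2} \exp\left(-\frac{2b}{m^2\sigma^2 z}\right), \quad 0 < z < 1. \tag{24}$$

For the stationary distribution, $z$ has a single mode $\hat z$, 

$$\hat z = \frac{n + a - [(n + a)^2 - 4m^2\sigma^2 b]^{1/2}}{2m^2\sigma^2}. \tag{25}$$
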
The proof is given in Appendix B. Using equation (17), we can 
calculate the unconditional mean and variance of the consumption 
growth rate as 

$$\frac{E(dc/c)}{dt} = n + b - (n + a)\int_0^1 z\,p_z(z)dz \tag{26}$$

and 

$$\frac{\mathrm{var}(dc/c)}{dt} = m^2\sigma^2 \int_0^1 (1 - z)^2 p_z(z)dz. \tag{27}$$

The density $p_z(z)$ is given in theorem 2, and the integration is done 
numerically since we are unable to obtain closed-form expressions for 
the integrals. 
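
The numerical integration mentioned here can be sketched as follows. This is an illustration under explicit assumptions, not the procedure used in the paper: the density of $z$ is taken to be proportional to $(1 - z)^{2(n+a-b-m^2\sigma^2)/m^2\sigma^2} z^{-2(n+a-b)/m^2\sigma^2} \exp[-2b(1 - z)/(m^2\sigma^2 z)]$, normalized as an inverse-gamma law for $y = z/(1 - z)$; the integrals are approximated with a midpoint rule; and the function name and parameter values are hypothetical.

```python
import math


def consumption_growth_moments(n, a, b, m, sigma, grid=20000):
    """Midpoint-rule approximation of the stationary moments of z = x/c."""
    s2 = (m * sigma) ** 2
    kappa = n + a - b - s2
    if kappa <= 0 or b <= 0:
        raise ValueError("need n + a - b - m^2 sigma^2 > 0 and b > 0")
    shape = 2 * (n + a - b) / s2 - 1          # inverse-gamma shape for y
    scale = 2 * b / s2                        # inverse-gamma scale for y
    log_lambda = shape * math.log(scale) - math.lgamma(shape)
    mass = mean_z = mean_y = mean_sq = 0.0
    step = 1.0 / grid
    for i in range(grid):
        z = (i + 0.5) * step                  # midpoints avoid z = 0 and z = 1
        log_p = (
            log_lambda + scale
            + (2 * kappa / s2) * math.log(1 - z)
            - (2 * (n + a - b) / s2) * math.log(z)
            - scale / z
        )
        weight = math.exp(log_p) * step
        mass += weight
        mean_z += z * weight
        mean_y += z / (1 - z) * weight
        mean_sq += (1 - z) ** 2 * weight
    return {
        "mass": mass,                          # should be close to one
        "mean_y": mean_y,                      # compare with b / kappa
        "mean_growth": n + b - (n + a) * mean_z,
        "var_growth": s2 * mean_sq,
    }


# Hypothetical parameter values for illustration only.
moments = consumption_growth_moments(n=0.1, a=0.4, b=0.4, m=1.0, sigma=0.1)
```

Working with the logarithm of the density avoids overflow in the normalizing constant. Two checks are available without closed forms for the integrals: the density should integrate to one, and the mean of $z/(1 - z)$ should match $b/(n + a - b - m^2\sigma^2)$.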


C. A Wedge between the RRA Coefficient and the 

Inverse of the Intertemporal Elasticity of Substitution 
in Consumption 


I define the RRA coefficient and the intertemporal elasticity of 
substitution in consumption ($s$). I show that the product $s \cdot \mathrm{RRA}$ equals 
one in the time-separable model ($b = 0$) but is substantially below one 
in the nonseparable model and for the particular parameter values 
that resolve the equity premium puzzle. Thus habit persistence drives 
a wedge between the RRA coefficient and the inverse of the 
intertemporal elasticity of substitution in consumption. 

I define the RRA coefficient in terms of an atemporal gamble that 
changes the current level of capital by the outcome of the gamble and 
is given by 

$$\mathrm{RRA} \equiv \frac{-W V_{WW}}{V_W} = \frac{1 - \gamma}{1 - \{x(t)/[W(t)(r + a - b)]\}}. \tag{28}$$


This definition is consistent with that in Giovannini and Weil (1988) 
for Kreps-Porteus preferences. In the context of an intertemporal 
model it would be improper to define the RRA coefficient in terms of 
an atemporal gamble that changes either current consumption or 
consumption at some specified future date by the outcome of the 
gamble. 

The RRA coefficient is a function of wealth and of the state variable 
x(t). A sudden drop in wealth leaves x(t) unchanged in the short run 
and increases the RRA coefficient. This drop is only temporary be¬ 
cause the RRA coefficient has a stationary distribution. To see this, we 
can use equation (12) to eliminate W(t) from equation (28) and obtain 

$$\mathrm{RRA} = (1-\gamma)\left\{1 + \frac{h\,y(t)}{r+a-b}\right\}. \tag{29}$$

The RRA coefficient has a steady-state distribution because y(t) does,
by theorem 2. Since the mean value of y(t) in the steady state is given
by equation (23), we obtain the mean of the RRA coefficient as

$$\overline{\mathrm{RRA}} = (1-\gamma)\left[1 + \frac{hb}{(r+a-b)(n+a-b-m^2\sigma^2)}\right]. \tag{30}$$


With the parameter values that resolve the equity premium puzzle, I
show in Section III that the mean of the RRA coefficient is of the
same order of magnitude as $1 - \gamma$.

The elasticity of substitution in consumption is defined here as the
derivative of the expected growth rate in consumption with respect to
r, with z(t), $\mu - r$, and $\sigma^2$ held constant:

$$s \equiv \left.\frac{\partial[E(dc/c)/dt]}{\partial r}\right|_{z(t),\,\mu-r,\,\sigma^2} = \frac{1 - z(t)}{1-\gamma}. \tag{31}$$


Note that the elasticity may also be defined as the inverse of the
expression $-c\,u_{cc}/u_c$. I stress, however, that this expression need not
equal the RRA coefficient because risk aversion is defined in terms of
an atemporal gamble that changes wealth and not in terms of a gamble
that changes consumption.

We can combine equations (29) and (31) and write the product of 
the elasticity of substitution and the RRA coefficient as 


$$s \cdot \mathrm{RRA} = (1-z)\left(1 + \frac{h}{r+a-b}\,y\right) = 1 - \left(1 - \frac{h}{r+a-b}\right)z. \tag{32}$$


To consider the special case of time-separable utility, let $b \to 0$. By
equation (25) the modal value of z tends to zero, and therefore the
modal value of the product s • RRA tends to one. Note that we do not
assume that $x_0 = 0$ or a > 0; therefore x(t) need not vanish as $b \to 0$. It
is the assumption that utility is time separable, and not the stronger
assumption that x(t) vanishes, that gives the result that the product
s • RRA has modal value one.

In his insightful exposition of growth theory, Solow (1970, p. 85) 
proved in the context of a deterministic growth model with time- 
separable utility that the consumption growth rate is linear in the net 
marginal product of capital, with the coefficient equal to the inverse 
of the RRA coefficient. Put differently, the intertemporal elasticity of 
substitution in consumption is the inverse of the RRA coefficient. The 
assumption that consumption growth is deterministic is not crucial. 
Hansen and Singleton (1983), Breeden (1986), and Hall (1988) extended
the result under uncertainty by making reasonable assumptions
about the stochastic process of consumption and the rates of
return.

With habit persistence, the mode of the stationary distribution of z 
is given by equation (25). In the next section I show that the modal
value of s • RRA is substantially below one at the parameter values 
that resolve the equity premium puzzle. 

Hansen and Singleton (1982, 1983), Ferson (1983), Grossman et al.
(1987), and others rejected the Euler equation restriction implied
by the time-separable model. Hall (1988) argued that since the
time-separable model forces the product s • RRA to equal one, these
results are rejections of the Euler equation and the hypothesis that the
product s • RRA equals one. Ferson and Constantinides (1989) rejected
the Euler equation implied by the time-separable model when
the alternative hypothesis is the Euler equation implied by the
nonseparable model. These results may be interpreted as evidence
against the restriction s • RRA = 1.

For the equilibrium of the particular model developed in this sec¬ 
tion, we may write the capital elasticity of consumption as 


$$\left.\frac{\partial c/c}{\partial W/W}\right|_{x} = s \cdot \mathrm{RRA} \tag{33}$$


and the ratio of the standard deviation of the consumption growth 
rate and the capital growth rate as 


$$\frac{\mathrm{std}(dc/c)}{\mathrm{std}(dW/W)} = s \cdot \mathrm{RRA}. \tag{34}$$


Since habit persistence allows the product s • RRA to be substantially
below one, we can conclude that habit persistence smooths consump¬ 
tion growth over and above the smoothing implied by the life cycle- 
permanent income hypothesis with time-separable utility. 


III. Resolution of the Equity Premium Puzzle 

I interpret the growth model developed in Section II as the equilib¬ 
rium in a representative-consumer production economy. The optimal 
consumption path is interpreted as the per capita consumption. 

Mehra and Prescott (1985) estimated the mean of the annual
growth rate of per capita real consumption of nondurables and services
in the years 1889-1978 to be .0183 with a range -.0025, .03 in
subperiods. They also estimated the standard deviation of the growth
rate in the years 1889-1978 to be .0357 with a range .010, .0528 in
subperiods. In terms of our notation, we want the model stated in
Section II to imply $E(dc/c)/dt = .0183$ per year and $\mathrm{var}(dc/c)/dt = (.0357)^2$ per year.




Mehra and Prescott estimated the mean annual real rate of return 
on a relatively riskless security to be .008, using 90-day Treasury bills 
in the 1931-78 period, Treasury certificates in the 1920-30 period, 
and 60-90-day prime commercial paper in the 1889-1920 period. 
Thus we can set r = .01 per year. 

Let us introduce a firm that has capital K(t) at time t. The firm has
free access to the two production technologies. It invests capital $\delta_1 K(t)$
in the risky technology and the remaining capital $(1 - \delta_1)K(t)$ in the
riskless technology, where $\delta_1$ is a constant, $0 < \delta_1 \le 1$. The firm is
financed with equity of value S(t) and riskless debt of value B(t). The
firm maintains the ratio $S(t)/[S(t) + B(t)] = \delta_2$ constant, $0 < \delta_2 \le 1$.
Since the firm has free access to the constant-returns-to-scale technologies,
the value of the firm equals its capital, that is, S(t) + B(t) =
K(t). Since the bonds are riskless, their rate of return is dB/B = r dt.
Denoting by dS/S the rate of return on equity, we obtain

$$dS(t) + B(t)r\,dt = \delta_1 K(t)[\mu\,dt + \sigma\,dw(t)] + (1 - \delta_1)K(t)r\,dt,$$

which simplifies into

$$\frac{dS}{S} = \frac{\delta_1}{\delta_2}[(\mu - r)\,dt + \sigma\,dw(t)] + r\,dt. \tag{35}$$

I interpret the equity of the firm as a portfolio of the stocks represented
in the Standard and Poor’s composite stock price index. Given
the leverage ($\delta_2$) of the firms represented in the index, the ratio $\delta_1/\delta_2$ is
free in the range $0 < \delta_1/\delta_2 \le \delta_2^{-1}$ since the parameter $\delta_1$ is free in the
range $0 < \delta_1 \le 1$. In our calculation, we can set $\delta_1/\delta_2 = 1$, which is
consistent with any amount of leverage.

Mehra and Prescott estimated the annual real return on the Stan¬ 
dard and Poor’s composite stock price index in the 1889-1978 period 
to have mean .0698 (with range -.0014, .1896 in subperiods) and 
standard deviation .1654 (with range .002, .2790 in subperiods). 
These estimates are generally consistent with those by Ibbotson and 
Sinquefield (1982, p. 15). Thus we can set 

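$$\frac{E(dS/S)}{dt} - r = \left(\frac{\delta_1}{\delta_2}\right)(\mu - r) = .06 \text{ per year} \tag{36}$$

and

$$\frac{\mathrm{var}(dS/S)}{dt} = \left(\frac{\delta_1}{\delta_2}\right)^2\sigma^2 = (.165)^2 \text{ per year}. \tag{37}$$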

The mean and variance of the consumption growth rate are independent
of the ratio $\delta_1/\delta_2$. Equations (26) and (27) show that the mean
and variance of the consumption growth rate depend on the parameters
$\mu$ and $\sigma$ only in the combination $(\mu - r)/\sigma = .06/.165$, which is
independent of $\delta_1/\delta_2$. However, condition (11) requires
$1 - \gamma \ge 2.2(\delta_1/\delta_2)$.

In the context of the production economy, time-separable utility
implies that the RRA coefficient equals 10.2; hence the equity premium
puzzle. To see this, time-separable utility implies that $b \to 0$ and
that the modal value of z(t) is zero. Then

$$\frac{\mathrm{var}(dc/c)}{dt} = m^2\sigma^2 = \frac{(\mu - r)^2}{\sigma^2(1-\gamma)^2} = (.0357)^2,$$

which implies $1 - \gamma = 10.2$, irrespective of the ratio $\delta_1/\delta_2$.¹
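As a quick arithmetic check of the 10.2 figure, the sketch below backs out 1 − γ from the relation above, using the equity moments in equations (36) and (37) and the sample standard deviation of consumption growth. The helper name and the grid of δ₁/δ₂ values are illustrative choices, not part of the paper.

```python
def implied_rra(equity_premium=0.06, equity_std=0.165, consumption_std=0.0357, ratio=1.0):
    """Back out 1 - gamma in the time-separable case, where m * sigma = std of consumption growth."""
    mu_minus_r = equity_premium / ratio  # technology premium implied by equation (36)
    sigma = equity_std / ratio           # technology volatility implied by equation (37)
    # m * sigma = (mu - r) / [(1 - gamma) * sigma], so 1 - gamma = (mu - r) / (sigma * std_c)
    return mu_minus_r / (sigma * consumption_std)


# The ratio delta_1 / delta_2 cancels, so every line prints the same value.
for ratio in (0.5, 1.0, 2.0):
    print(ratio, round(implied_rra(ratio=ratio), 2))
```

Because the premium and the volatility are scaled by the same ratio, the implied coefficient does not depend on leverage, which is the sense of "irrespective of the ratio" in the text.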

I proceed to show that habit persistence can generate the sample 
mean and variance of the consumption growth rate with a low RRA 
coefficient. Let us set $\rho = .037$ per year, $1 - \gamma = 2.2$, and $\delta_1/\delta_2 = 1$.
The reader may verify that these parameter values, together with the 
parameter values specified in equations (36) and (37), satisfy the 
model conditions (5), (9), and (11). 

Let us consider pairs of parameter values (a, b) that satisfy the 
conditions (8) and (19). For each pair (a, b), the stationary distribution
of z is given by equation (24). We can calculate the mean and variance 
of the consumption growth rate by performing the numerical integra¬ 
tion in equations (26) and (27). Table 1 reports pairs (a, b) for which 
the mean and variance of the consumption growth rate match their 
sample estimates. 

The table also reports the mean RRA coefficient. As one shifts to 
the right of the table, the mean RRA coefficient decreases and approaches
the value $1 - \gamma = 2.2$. The equity premium puzzle is resolved
in the sense that the model generates the mean and variance of
the consumption growth rate with the mean RRA coefficient as low as 
2.81. 
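The closed-form entries of table 1 (the mode of z, the RRA coefficient, and the elasticity of substitution) can be recomputed from equations (25) and (29)-(31). The sketch below assumes the parameter values in the note to table 1 and the Section II expressions for m, n, and h, which are not restated in this section; the function name and dictionary keys are illustrative.

```python
import math

# Assumed parameter values, taken from the note to table 1.
R, MU_MINUS_R, SIGMA = 0.01, 0.06, 0.165
GAMMA, RHO = -1.2, 0.037


def table1_column(a, b):
    """Closed-form quantities for one (a, b) pair, following equations (25) and (29)-(31)."""
    one_minus_gamma = 1.0 - GAMMA
    m = MU_MINUS_R / (one_minus_gamma * SIGMA**2)
    v = m**2 * SIGMA**2  # m^2 sigma^2
    n = (R - RHO) / one_minus_gamma + MU_MINUS_R**2 * (2.0 - GAMMA) / (
        2.0 * one_minus_gamma**2 * SIGMA**2
    )
    # h chosen so that (r + a) h / (r + a - b) equals the bracketed term divided by (1 - gamma)
    bracket = RHO - GAMMA * R - GAMMA * MU_MINUS_R**2 / (2.0 * one_minus_gamma * SIGMA**2)
    h = (R + a - b) * bracket / ((R + a) * one_minus_gamma)

    mode_z = (n + a - math.sqrt((n + a) ** 2 - 4.0 * v * b)) / (2.0 * v)  # equation (25)
    mode_y = mode_z / (1.0 - mode_z)
    rra_at_mode = one_minus_gamma * (1.0 + h * mode_y / (R + a - b))      # equation (29)
    mean_rra = one_minus_gamma * (
        1.0 + h * b / ((R + a - b) * (n + a - b - v))                     # equation (30)
    )
    s_at_mode = (1.0 - mode_z) / one_minus_gamma                          # equation (31)
    return {
        "mode_z": mode_z,
        "mean_rra": mean_rra,
        "rra_at_mode": rra_at_mode,
        "s_at_mode": s_at_mode,
        "s_rra_at_mode": s_at_mode * rra_at_mode,
    }


for a, b in ((0.1, 0.093), (0.6, 0.492)):
    print(a, b, {key: round(value, 3) for key, value in table1_column(a, b).items()})
```

The results can be set against the first and last columns of table 1. Entries built from products of rounded figures, such as s • RRA, may differ from the printed table in the second decimal.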

If the value of 2.81 is not sufficiently low relative to the reader’s 
prior on the RRA coefficient, we can generate the target mean and 
variance of the consumption growth rate with a lower RRA coefficient 
by setting a lower value for $1 - \gamma$. Now in order to satisfy the condition
(11), we have to set $\delta_1/\delta_2 > 1$. If we assume that the firm has a
debt/equity ratio equal to one, then $\delta_2 = .5$ and we can set $\delta_1/\delta_2 = 2$


1 Friend and Blume (1975) estimated the demand for risky assets and inferred the 
RRA coefficient to be well below 10, under the assumption that the investment oppor¬ 
tunity set is constant. Black (1988) and Kocherlakota (1988) pointed out that the Friend 
and Blume inference of the RRA coefficient is invalid if the investment opportunity set 
is not constant. An alternative source of estimates of the RRA coefficient, which does 
not rely on the assumption of a constant investment opportunity set, is based on the 
Euler equation implied by time-separable and non-time-separable utility functions. 
Typically, risk aversion is estimated to be well below 10. Some of this literature is 
reviewed in Sec. V.

TABLE 1

Mean and Variance of the Consumption Growth Rate Generated
by the Model with Habit Persistence

Parameter a, per year                            .1     .2     .3     .4     .5     .6
Parameter b                                      .093   .172   .250   .328   .405   .492
Mode ($\hat z$) of the state variable z          .86    .82    .81    .80    .79    .81
Mean annual growth rate in consumption:
  Unconditional mean                             .018   .019   .018   .018   .018   .018
  At z = $\hat z$                                .011   .013   .014   .014   .014   .014
Standard deviation of the annual growth
rate in consumption:
  Unconditional mean                             .036   .036   .036   .036   .036   .034
  At z = $\hat z$                                .023   .029   .032   .033   .034   .032
RRA coefficient:
  Unconditional mean                             8.67   4.37   3.47   3.09   2.88   2.81
  At z = $\hat z$                                7.03   4.09   3.36   3.03   2.84   2.78
Elasticity of substitution (s) at z = $\hat z$   .06    .08    .09    .09    .09    .09
s • RRA at z = $\hat z$                          .42    .33    .30    .27    .26    .25

NOTE: The assumed parameter values are r = .01, the annual rate of return of the riskless technology; $\mu - r = .06$,
the difference between the mean annual rate of return of the risky technology and the annual rate of return of
the riskless technology; $\sigma = .165$, the standard deviation of the annual rate of return of the risky technology;
$\gamma = -1.2$, the power in the utility function; and $\rho = .037$, the rate of time preference in units (year)$^{-1}$.


and $1 - \gamma = 1.1$. By judicious choice of parameters (a, b), we can
generate the target mean and variance of the consumption growth 
rate with the RRA coefficient close to 1.1. 

An interesting feature of table 1 is that the modal value of the state 
variable z(t) = x(t)/c(t) is about .8 for all the reported (a, b) pairs. The 
model predicts that the subsistence level of consumption, x(t), gener¬ 
ated by habit persistence, is about 80 percent of the level of consump¬ 
tion. This prediction is discussed further in the last section. Another 
interesting feature of the table is that the intertemporal elasticity of 
substitution in consumption is substantially below one. Finally, the 
product of the elasticity of substitution and the RRA coefficient is 
about .25 for the pairs (a, b) that resolve the equity premium puzzle 
with a low RRA coefficient. This illustrates the key role of habit per¬ 
sistence in resolving the puzzle by driving a wedge between the RRA 
coefficient and the inverse of the elasticity of substitution. 

IV. Discussion 

We have resolved the equity premium puzzle by relaxing Mehra and 
Prescott’s (1985) assumption that utility is time separable. However,
our economy differs from theirs in two other respects as well. First,
our economy allows for production while theirs is an exchange econ¬ 
omy. Second, theirs is a discrete-time economy in which the state is a 
Markov process with two realizations, while ours is a continuous-time 
economy in which the forcing process is a diffusion. To make the case 
that habit persistence is the key to the puzzle, we need to demonstrate 
that these two differences in modeling the economy are inessential. 

The first difference is inessential because, as Mehra and Prescott 
(1985) and Mehra (1988) pointed out, the task of explaining the puz¬ 
zle in a production economy is not easier than in an exchange econ¬ 
omy. The introduction of production does not increase the set of joint 
equilibrium processes on consumption and asset prices. In fact it may 
be harder to explain the puzzle in a production economy because the 
consumption process is no longer exogenous but must be obtained as 
the equilibrium outcome. 

The second difference is inessential as well. I demonstrate that time 
separability in preferences is the key restriction that generates the 
puzzle in Mehra and Prescott’s economy. Let $m_{t+1}$ be the marginal
rate of substitution and $R_{Ft}$ be the one-plus riskless rate of interest
between periods t and t + 1. The Euler equation states that

$$E(m_{t+1}R_{Ft} \mid I_t) = 1, \tag{38}$$

where $I_t$ is the public information in period t. Since $R_{Ft}$ is in the
information set $I_t$, we can write $E(m_{t+1} \mid I_t) = R_{Ft}^{-1}$ and, by Jensen’s
inequality, express the unconditional mean of the marginal rate of
substitution as

$$E(m) = E(R_F^{-1}) \ge [E(R_F)]^{-1}. \tag{39}$$

Let equity have one-plus rate of return $R_{t+1}$. The Euler equation
states that

$$E(m_{t+1}R_{t+1} \mid I_t) = 1 \tag{40}$$

and

$$E(mR) = 1. \tag{41}$$

Following the methodology of Hansen and Jagannathan (1988), we
can write

$$\begin{aligned} 1 = E(mR) &= E(m)E(R) + \mathrm{cov}(m, R) \\ &\ge E(m)E(R) - \mathrm{std}(m)\,\mathrm{std}(R) \\ &\ge \frac{E(R)}{E(R_F)} - \mathrm{std}(m)\,\mathrm{std}(R) \end{aligned}$$

by equation (39) or

$$\mathrm{std}(m) \ge \frac{[E(R)/E(R_F)] - 1}{\mathrm{std}(R)}. \tag{42}$$


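Assume that utility is separable and of the form $\sum_{t=0}^{\infty} \beta^t\gamma^{-1}c_t^{\gamma}$. Then
the marginal rate of substitution is $m_{t+1} = \beta(c_{t+1}/c_t)^{\gamma-1}$. Further assume
that the consumption growth rate is bounded by

$$g_1 \le \frac{c_{t+1}}{c_t} \le g_2. \tag{43}$$

Then the marginal rate of substitution is bounded by

$$\beta g_2^{\gamma-1} \le m_{t+1} \le \beta g_1^{\gamma-1}, \tag{44}$$

and its standard deviation is bounded by

$$\mathrm{std}(m) \le \frac{\beta(g_1^{\gamma-1} - g_2^{\gamma-1})}{2}. \tag{45}$$

Combining inequalities (42) and (45), we obtain a lower bound on the
RRA coefficient, $1 - \gamma$, as

$$g_1^{\gamma-1} - g_2^{\gamma-1} \ge \frac{2}{\beta}\,\frac{[E(R)/E(R_F)] - 1}{\mathrm{std}(R)}. \tag{46}$$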

Mehra and Prescott’s parameter estimates are $E(R_F) = 1.01$ per
year, $E(R) = 1.07$ per year, and $\mathrm{std}(R) = .165$ per year. They assumed
a two-state Markov process for the annual consumption
growth rate. By the method of moments they estimated the annual
consumption growth rate to be .982 or 1.054. Thus we can set $g_1 = .982$
and $g_2 = 1.054$. Our restriction (46) on the RRA coefficient
becomes

$$.982^{\gamma-1} - 1.054^{\gamma-1} \ge \frac{2}{\beta}\,\frac{(1.07/1.01) - 1}{.165}. \tag{47}$$

For $\beta = .8$, the lower bound on the RRA coefficient $1 - \gamma$ is greater
than or equal to 16; for $\beta = .9$, 14; for $\beta = 1$, 12; for $\beta = 1.1$, 11; and
for $\beta = 1.2$, 10. The risk aversion is high, thereby illustrating the
equity premium puzzle. Note that this conclusion is independent of 
the firm’s leverage and of the correlation between consumption 
growth and the dividends on equity. 
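The bounds quoted above can be approximated with a short bisection. The sketch assumes that restriction (46) takes the form obtained by combining (42) and (45), namely g₁^(γ−1) − g₂^(γ−1) ≥ (2/β){[E(R)/E(R_F)] − 1}/std(R), and it uses the Mehra and Prescott estimates cited in the text. The function name, the search bracket, and the iteration count are illustrative choices.

```python
def min_rra(beta, g1=0.982, g2=1.054, mean_equity=1.07, mean_riskless=1.01, std_equity=0.165):
    """Smallest 1 - gamma satisfying the assumed form of restriction (46), found by bisection."""
    required = (2.0 / beta) * (mean_equity / mean_riskless - 1.0) / std_equity

    def spread(rra):
        # g1**(gamma - 1) - g2**(gamma - 1) with gamma - 1 = -rra;
        # this is increasing in rra whenever g1 < 1 < g2.
        return g1 ** (-rra) - g2 ** (-rra)

    low, high = 0.0, 200.0  # spread(0) = 0 is below the requirement; spread(200) is far above it
    for _ in range(100):
        mid = 0.5 * (low + high)
        if spread(mid) >= required:
            high = mid
        else:
            low = mid
    return high


for beta in (0.8, 0.9, 1.0, 1.1, 1.2):
    print(beta, round(min_rra(beta), 1))
```

The roots are not integers, so they are best compared with the figures in the text after rounding to the nearest whole number.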

Essentially the lower bound on the consumption growth rate puts 
an upper bound on the marginal rate of substitution that is severe if 
utility is time separable. This causes the inability to explain the mean 
premium on equity returns. Rietz (1988) recognized the pivotal role
of the lower bound on consumption growth. He proposed a model 
that allows for a disaster state, in which consumption may drop by as
much as 25 percent in one year. The model generates sufficient variability
in the marginal rate of substitution and explains the observed
mean premium on equity. Mehra and Prescott (1988) responded that 
the existence of such disasters has testable, but empirically unob¬ 
served, economic implications at times of impending disaster, such as 
the period of the Cuban Missile Crisis. 

A plausible explanation of the puzzle, suggested by Mehra and 
Prescott, is that consumers are heterogeneous and the market is in¬ 
complete. Bewley (1982), Mankiw (1986), Scheinkman and Weiss 
(1986), and Scheinkman (1989) presented models with uninsurable 
risks. They illustrated that an econometrician may grossly overesti¬ 
mate risk aversion based on per capita consumption. The individuals’ 
consumption growth rate may be substantially more variable than the 
per capita growth rate. Even with low variability of per capita con¬ 
sumption, the individuals' marginal rate of substitution may be 
sufficiently variable to explain the observed mean premium on equity. 

Brainard and Summers (1987) and Kocherlakota (1988) allowed 
the equity to be levered, and Kocherlakota allowed the subjective 
discount rate to be negative ($\rho < 0$, i.e., $\beta > 1$). They found that the
RRA coefficient must exceed 10 in order to generate the observed 
mean equity premium. This conclusion is consistent with the bounds 
on risk aversion derived in this section for time-separable utility. 

Kocherlakota (1987) and Weil (1987) considered preferences that 
are not von Neumann-Morgenstern and found that the RRA coeffi¬ 
cient must be high to explain the puzzle. 

Nason (1988) generated the observed mean premium on equity by 
introducing state-nonseparable preferences in which the direct utility 
of consumption depends on past output. Whereas equilibrium con¬ 
sumption equals output in his model, the Euler equation and price 
paths are different from those implied by a direct utility function that 
depends on past consumption. One may view Nason’s model as one in 
which utility exhibits habit persistence but the representative agent is 
myopic in that the agent disregards the effect of current consumption 
decisions on future utility. 

The model in this paper generates the requisite high variability in 
the marginal rate of substitution in consumption with relatively low 
variability in the consumption growth rate through habit persistence 
in utility and low risk aversion. Essentially past consumption gener¬ 
ates a subsistence level of consumption (which must be about 80 per¬ 
cent of the normal consumption rate in order to explain the mean 
equity premium, as in table 1). A small drop in consumption gener¬ 
ates a large drop in consumption net of the subsistence level and a 
large drop in the marginal rate of substitution that makes it possible 
to match the observed equity premium with low risk aversion. 




V. Concluding Remarks 

One prediction of habit persistence is that the subsistence rate of 
consumption is positive. For the particular parameter values that ex¬ 
plain the observed mean premium on equity, the subsistence rate of 
consumption is about 80 percent of the recent past consumption rate. 

Habit persistence and durability of goods are opposing effects in 
that habit persistence tends to make certain lag coefficients in the 
Euler equation negative while durability tends to reverse their signs. 
Dunn and Singleton (1986), Eichenbaum, Hansen, and Singleton 
(1988), Gallant and Tauchen (1989), and Eichenbaum and Hansen 
(in press) used monthly data and estimated positive coefficients that 
are interpreted as evidence of durability. However, Ferson and Con- 
stantinides (1989) used quarterly and annual data and estimated 
negative coefficients that are interpreted as evidence of habit persis¬ 
tence with the subsistence level of the predicted order of magnitude. 
Furthermore, they rejected the time-separable model in favor of the 
model with habit persistence. Hansen and Jagannathan (1988) also 
found evidence in favor of habit persistence, using monthly data. 
Finally, Heaton (1988) examined the monthly and quarterly autocor¬ 
relations in consumption, while taking into account time aggregation, 
and interpreted his results as evidence of habit persistence. 

Habit persistence departs from the familiar paradigm of state- and 
time-separable preferences. To become the new economic paradigm, 
habit persistence ought to be embedded in models of the business 
cycle, labor behavior, public finance, and so forth with preferences, 
technologies, and dynamics richer than the ones introduced in this 
paper and its predictions validated by empirical testing. 


Appendix A 

Proof of Theorem 1 

a) The proof employs a technique that is applied in a different context by
Davis and Norman (1987). Assume that an optimal policy $\{c(s), \alpha(s), t \le s < \infty\}$ is
given by $c(s) = c^*(s)$ and $\alpha(s) = \alpha^*(s)$, where $c^*(s)$ and $\alpha^*(s)$ are defined in
equations (12) and (13). I shall prove that the optimal admissible policy for
$0 \le s \le t$ is unique and is also given by $c(s) = c^*(s)$ and $\alpha(s) = \alpha^*(s)$.

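b) For $s \ge t$, the capital increase is

$$dW(s) = \{[(\mu - r)\alpha^*(s) + r]W(s) - c^*(s)\}\,ds + \sigma\alpha^*(s)W(s)\,dw(s). \tag{A1}$$

Also,

$$dx(s) = [bc^*(s) - ax(s)]\,ds. \tag{A2}$$

Therefore,

$$\begin{aligned} d\left[W(s) - \frac{x(s)}{r+a-b}\right] &= \left\{[(\mu - r)\alpha^*(s) + r]W(s) - c^*(s) - \frac{bc^*(s) - ax(s)}{r+a-b}\right\}ds + \sigma\alpha^*(s)W(s)\,dw(s) \\ &= \left[W(s) - \frac{x(s)}{r+a-b}\right][n\,ds + m\sigma\,dw(s)], \end{aligned} \tag{A3}$$

which implies

$$d\ln\left[W(s) - \frac{x(s)}{r+a-b}\right] = \left(n - \frac{m^2\sigma^2}{2}\right)ds + m\sigma\,dw(s) \tag{A4}$$

with solution

$$W(s) - \frac{x(s)}{r+a-b} = \left[W(t) - \frac{x(t)}{r+a-b}\right]\exp\left\{\left(n - \frac{m^2\sigma^2}{2}\right)(s - t) + m\sigma[w(s) - w(t)]\right\}, \qquad s \ge t. \tag{A5}$$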

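For $s \ge t$,

$$\begin{aligned} e^{-\rho(s-t)}E_t\{[c^*(s) - x(s)]^{\gamma}\} &= h^{\gamma}\left[W(t) - \frac{x(t)}{r+a-b}\right]^{\gamma}\exp\left[-\rho(s-t) + \gamma\left(n - \frac{m^2\sigma^2}{2}\right)(s-t)\right] \\ &\quad \times E_t\exp\{\gamma m\sigma[w(s) - w(t)]\} \\ &= h^{\gamma}\left[W(t) - \frac{x(t)}{r+a-b}\right]^{\gamma}\exp\left\{\left[-\rho + \gamma\left(n - \frac{m^2\sigma^2}{2}\right) + \frac{\gamma^2m^2\sigma^2}{2}\right](s-t)\right\} \\ &= h^{\gamma}\left[W(t) - \frac{x(t)}{r+a-b}\right]^{\gamma}\exp\left[-\frac{(r+a)h(s-t)}{r+a-b}\right] \end{aligned} \tag{A6}$$

since

$$-\rho + \gamma\left(n - \frac{m^2\sigma^2}{2}\right) + \frac{\gamma^2m^2\sigma^2}{2} = -\frac{1}{1-\gamma}\left[\rho - \gamma r - \frac{\gamma(\mu - r)^2}{2(1-\gamma)\sigma^2}\right] = -\frac{(r+a)h}{r+a-b}. \tag{A7}$$

Then

$$\begin{aligned} V(W(t), x(t)) &= E_t\int_t^{\infty} e^{-\rho(s-t)}\gamma^{-1}[c^*(s) - x(s)]^{\gamma}\,ds \\ &= \frac{h^{\gamma}}{\gamma}\left[W(t) - \frac{x(t)}{r+a-b}\right]^{\gamma}\int_t^{\infty}\exp\left[-\frac{(r+a)h(s-t)}{r+a-b}\right]ds \\ &= \frac{(r+a-b)h^{\gamma-1}}{(r+a)\gamma}\left[W(t) - \frac{x(t)}{r+a-b}\right]^{\gamma} \end{aligned} \tag{A8}$$

since $(r+a)h/(r+a-b) > 0$. Note also that V(W(t), x(t)) is a $C^2$ function.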
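The identity in (A7), which turns the exponent in (A6) into −(r + a)h/(r + a − b), can be spot-checked numerically. The sketch below assumes the parameter values in the note to table 1 and the Section II definitions of m and n (not restated here); the parameters a and b cancel from the right-hand side once h is substituted, so they do not appear. Variable names are illustrative.

```python
# Assumed parameter values, taken from the note to table 1.
r, mu_minus_r, sigma = 0.01, 0.06, 0.165
gamma, rho = -1.2, 0.037

# Assumed Section II definitions of m and n.
m = mu_minus_r / ((1.0 - gamma) * sigma**2)
n = (r - rho) / (1.0 - gamma) + mu_minus_r**2 * (2.0 - gamma) / (
    2.0 * (1.0 - gamma) ** 2 * sigma**2
)
v = m**2 * sigma**2  # m^2 sigma^2

# Left-hand side of (A7): the exponent that multiplies (s - t) in (A6).
exponent = -rho + gamma * (n - v / 2.0) + gamma**2 * v / 2.0

# Right-hand side of (A7) after substituting h, which leaves -[...] / (1 - gamma).
closed_form = -(rho - gamma * r - gamma * mu_minus_r**2 / (2.0 * (1.0 - gamma) * sigma**2)) / (
    1.0 - gamma
)

print(exponent, closed_form)
```

Both numbers should agree to floating-point precision and be negative, which is what makes the integral in (A8) converge.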
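c) Define

$$M(t) \equiv \int_0^t e^{-\rho s}\gamma^{-1}[c(s) - x(s)]^{\gamma}\,ds + e^{-\rho t}V(W(t), x(t)) \tag{A9}$$

for an arbitrary policy $\{c(s), \alpha(s), 0 \le s \le t\}$. Applying Ito’s lemma, we obtain
$dM(t) = N(t)\,dt + e^{-\rho t}\sigma\alpha WV_W\,dw(t)$, where

$$N(t) \equiv e^{-\rho t}\left\{\gamma^{-1}(c - x)^{\gamma} - \rho V + \{[(\mu - r)\alpha + r]W - c\}V_W + \tfrac{1}{2}\sigma^2\alpha^2W^2V_{WW} + (bc - ax)V_x\right\}. \tag{A10}$$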


Since $V_{WW} < 0$, N(t) is concave in $(c, \alpha)$. Suppressing momentarily the condition
$0 \le \alpha \le 1$ and maximizing N(t) with respect to $(c, \alpha)$, we obtain the first-order
conditions that are necessary and sufficient:

$$(c - x)^{\gamma-1} - V_W + bV_x = 0 \tag{A11}$$

and

$$(\mu - r)WV_W + \alpha\sigma^2W^2V_{WW} = 0. \tag{A12}$$

Solving, we obtain

$$c = x + h\left(W - \frac{x}{r+a-b}\right) = c^* \tag{A13}$$

and

$$\alpha = m\left(1 - \frac{x/W}{r+a-b}\right) = \alpha^*. \tag{A14}$$


Substituting $c^*(t)$ and $\alpha^*(t)$ in N(t), we obtain N(t) = 0. Therefore,

$$\begin{aligned} dM(t) &\le e^{-\rho t}\sigma\alpha WV_W\,dw(t) \quad \text{for arbitrary } (c, \alpha) \\ &= e^{-\rho t}\sigma\alpha WV_W\,dw(t) \quad \text{for } (c = c^*, \alpha = \alpha^*), \end{aligned} \tag{A15}$$

and M(t) is a supermartingale. Thus

$$E_0\int_0^{\infty} e^{-\rho s}\gamma^{-1}[c(s) - x(s)]^{\gamma}\,ds = E_0M(\infty) \le E_0M(0) = V(W_0, x_0) \tag{A16}$$

with equality iff $c(s) = c^*(s)$ and $\alpha(s) = \alpha^*(s)$, for all s, $0 \le s \le t$. Therefore, the
optimal policy for $0 \le s \le t$ is unique and is given by $c(s) = c^*(s)$ and $\alpha(s) = \alpha^*(s)$.

d) The policy $\{c^*(t), \alpha^*(t), t \ge 0\}$ obviously fulfills condition i of an admissible
policy. It also fulfills condition ii. To see this, we proceed as in part b
and derive (A5). Since $W_0 - [x_0/(r+a-b)] > 0$, it follows that $W(t) - [x(t)/(r+a-b)] > 0$.
Since h > 0, it follows that c(t) - x(t) > 0. Since $x_0 > 0$,
then c(0) > 0, x(t) > 0, and c(t) > 0. Finally, $\int_0^t c(s)\,ds < \infty$ for all t almost surely.

The policy also fulfills condition iii of an admissible policy. Since $0 \le m \le 1$
and $0 < 1 - \{[x(t)/W(t)]/(r+a-b)\} < 1$, it follows that $0 \le \alpha^*(t) \le 1$.

Finally, W(t) > 0 since $W(t) - [x(t)/(r+a-b)] > 0$. This fulfills the last
condition of an admissible policy.

e) Under the optimal policy, equation (A8) gives the derived utility of 
capital as in equation (15). Equation (A5) gives the capital as in equation (16). 

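f) To find the consumption growth rate, we can use (A13) and (A5) to write

$$\begin{aligned} \ln(c - x) &= \ln h + \ln\left(W - \frac{x}{r+a-b}\right) \\ &= \ln h + \ln\left(W_0 - \frac{x_0}{r+a-b}\right) + \left(n - \frac{m^2\sigma^2}{2}\right)t + m\sigma w(t), \end{aligned} \tag{A17}$$

$$\frac{dc - dx}{c - x} = n\,dt + m\sigma\,dw(t), \tag{A18}$$

$$\begin{aligned} \frac{dc}{c} &= \frac{1}{c}[dx + (c - x)n\,dt + (c - x)m\sigma\,dw(t)] \\ &= \frac{1}{c}[bc - ax + (c - x)n]\,dt + \left(1 - \frac{x}{c}\right)m\sigma\,dw(t) \\ &= \left[n + b - (n + a)\frac{x}{c}\right]dt + \left(1 - \frac{x}{c}\right)m\sigma\,dw(t), \end{aligned} \tag{A19}$$

which proves (17).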


Appendix B 

Proof of Theorem 2 

i) We can combine equations (3) and (17) and obtain the diffusion equation
for z as

$$dz = [b - (n + a - m^2\sigma^2)z - m^2\sigma^2z^2](1 - z)\,dt - z(1 - z)m\sigma\,dw(t). \tag{B1}$$

From (B1) we obtain the diffusion equation for y = z/(1 - z) as

$$dy = [b - (n + a - b - m^2\sigma^2)y]\,dt - m\sigma y\,dw(t). \tag{B2}$$

The forward, or Fokker-Planck, equation for the density $p_y(y(t); y_0, t)$, $0 < t < \infty$, is

$$\frac{1}{2}\frac{\partial^2}{\partial y^2}(m^2\sigma^2y^2p_y) - \frac{\partial}{\partial y}\{[b - (n + a - b - m^2\sigma^2)y]p_y\} = \frac{\partial p_y}{\partial t}. \tag{B3}$$

This forward equation is a member of the class of forward equations studied
by Wong (1964). Specializing the results of Wong for the problem at hand, we
conclude that, for $0 \le y < \infty$, the stationary distribution $p_y(y)$ exists and is
the solution of the Pearson equation

$$\frac{1}{2}\frac{d}{dy}(m^2\sigma^2y^2p_y) - [b - (n + a - b - m^2\sigma^2)y]p_y = 0 \tag{B4}$$

subject to the normalization

$$\int_0^{\infty} p_y(y)\,dy = 1. \tag{B5}$$

We rewrite equation (B4) as

$$\frac{1}{2}m^2\sigma^2y^2\frac{dp_y}{dy} - [b - (n + a - b)y]p_y = 0. \tag{B6}$$

The solution of equation (B6) is equation (20), and the normalizing constant k
is stated in terms of a gamma function in equation (21).

A mode, $\hat y$, of the stationary distribution satisfies $dp_y/dy = 0$, and we obtain
$b - (n + a - b)\hat y = 0$ with unique solution given by equation (22). We
integrate equation (B6) by parts and obtain

$$\frac{1}{2}m^2\sigma^2[y^2p_y]_0^{\infty} - m^2\sigma^2\int_0^{\infty} yp_y\,dy - b + (n + a - b)\int_0^{\infty} yp_y\,dy = 0. \tag{B7}$$

Inspection of equation (20) implies $y^2p_y \to 0$ as $y \to 0$. Also, condition (19) and
equation (20) imply $y^2p_y \to 0$ as $y \to \infty$. Then equation (B7) gives the mean
value of y as in equation (23). Condition (19) guarantees that the mean is
finite.

ii) Since z = y/(1 + y) and $0 \le y < \infty$, it follows that z is monotone increasing
in y in the domain of y and $0 \le z < 1$. Since y(t) has a stationary distribution, so
does z(t). The density of the stationary distribution of z is $p_z(z)$:

$$p_z(z) = p_y(y)\frac{dy}{dz} = \frac{p_y(z/(1-z))}{(1-z)^2}. \tag{B8}$$

Combining equations (B8) and (20), we obtain equation (24).

If $\hat z$ is a mode of the stationary distribution, it must satisfy $dp_z(\hat z)/dz = 0$,
which, on simplification, becomes

$$m^2\sigma^2\hat z^2 - (n + a)\hat z + b = 0. \tag{B9}$$

The left-hand side of (B9) equals b > 0 at $\hat z = 0$ and $m^2\sigma^2 - n - a + b < 0$ at
$\hat z = 1$. Therefore, the quadratic has only one root in the range $0 < \hat z < 1$. The
stationary distribution has a single mode at $\hat z$ given by equation (25). This
completes the proof.


Relatively little is known about the behavior and performance of
firms organized as partnerships. In this paper we attempt to fill that 
gap by developing and testing a model of the effect of alternative 
compensation arrangements on productive efficiency in medical 
group practices. The technique employed is maximum likelihood 
production frontier estimation. We provide a simple behavioral 
model of the determination of productive efficiency and a new inter¬ 
pretation of the economic measure of technical efficiency. We derive 
a “behavioral production function” that relates production to individual
responses to incentives, and we indicate the impossibility of
recovering the parameters of the production technology from ob¬ 
served behavior. Overall, the empirical results indicate that incen¬ 
tives do affect productivity. A larger number of members in a group 
decreases productivity while greater average experience leads to 
greater productivity.

I. Introduction

The role of the services sector in the economy has grown increasingly 
large, and partnerships are a prevalent form of organization in this 
sector. Partnerships can choose among a wide variety of methods to 
compensate their members. However, relatively little is known about 
how these methods affect the behavior of members and the perfor¬ 
mance of the partnership firm. In this paper we attempt to fill that 
gap by developing and testing a model of the effect of alternative 
compensation arrangements on productive efficiency in medical 
group practices. 

There is a vast theoretical literature on compensation, organiza¬ 
tional form, and efficiency in firms (e.g., Alchian and Demsetz 1972; 
Williamson 1975; Holmström 1982; Nalebuff and Stiglitz 1983;
Mookherjee 1984; Holmström and Tirole 1989), but the empirical
literature is comparatively sparse. The basic theoretical results are 
that productivity-based compensation arrangements (e.g., piece rates) 
are best when production is nonjoint across agents. Jointness in pro¬ 
duction calls for some kind of sharing of revenues, costs, or profits 
plus monitoring where observability is possible. If observability is im¬ 
possible, bonus-penalty schemes work best. When there is a significant 
stochastic component to production and agents are risk averse, tour¬ 
naments may be preferred. 

Some empirical evidence on this matter for partnerships has been 
provided by the literature on the economics of medical group prac¬ 
tices and legal group practices. 1 For medical practice, Newhouse 
(1973) provided evidence of “behavioral diseconomies of scale” (or 
shirking) under equal-sharing arrangements as size increases.
Reinhardt, Pauly, and Held (1979), using a more comprehensive data 
set, estimated a production function that examined the effect of com¬ 
pensation arrangements. They found evidence that productivity- 
based compensation arrangements do lead to greater productivity. 
Similar evidence on legal practice is presented by Leibowitz and
Tollison (1980). Both of these studies show “shirking” present under 
equal-sharing, non-productivity-based compensation arrangements. 

The work in this paper differs from these prior studies in both 
focus and technique. The intent of this research is to uncover the 
determinants of productive efficiency in medical partnerships. The 
technique employed is estimation of production function frontiers by 
maximum likelihood methods. This technique allows for and mea¬ 
sures differences across agents in tastes for producing efficiently or 


1 See Baker, Jensen, and Murphy (1988) for a review of the evidence on executive 
compensation in corporations. 
the ability to do so. In the econometric literature on frontier production 
functions, productive efficiency is assumed to be exogenously 
given. In this paper we provide a simple behavioral model of the 
determination of productive efficiency and a new interpretation of 
the econometric measure of technical efficiency. We derive a “behav¬ 
ioral production function” that relates production to individual re¬ 
sponses to incentives, and we indicate the impossibility of recovering 
the parameters of the production technology from observed behav¬ 
ior. Finally, we treat the compensation structure facing the partners 
as chosen by or for the partners and deal explicitly with that en¬ 
dogeneity in the empirical methods used. 


II. The Behavioral Model 

In this model the partners are assumed to be utility-maximizing 
agents who make decisions over “effort” in response to the incentives 
present in the firm’s compensation method. 2 The compensation struc¬ 
ture is treated as fixed by any partner, although it is endogenous as 
far as the group as a whole is concerned. 3 


A. The Formal Model 

We model the production process as dependent on the usual physical 
inputs and on the productive “effort” partners exert. Effort is defined 
as a variable input supplied by an individual partner that determines 
the efficiency of production. In order to highlight the efficiency aspects 
of production, all other inputs are assumed to be chosen at the 
firm level, 4 and all partners are assumed to be identical. 

Production is described by the production function 

$$q_i = f(h_i, t_i, k_i, e_i, \theta_i), \tag{1}$$

where $q_i$ is the quantity produced by partner $i$, 5 $h_i$ is partner $i$'s hours 
at work in the given period, $t_i$ is nonpartner labor hours used by $i$, $k_i$ is 

2 This model draws on that contained in Gaynor (1989). 

3 Lee (1988) has also studied compensation in large group practices, and Feldman, 
Sloan, and Paringer (1981) have investigated this relationship for hospital-based physi¬ 
cians. But neither study has investigated the effect of compensation structure on per¬ 
formance or estimated frontier functions. 

4 Some empirical support for this assumption is provided by the fact that in the data 
sample employed in this study, less than 35 percent of physicians indicated that they set 
their own hours. We examine the statistical validity of this assumption in the empirical 
section that follows. 

5 Output is assumed homogeneous. Although output may truly be heterogeneous 
and compensation structure will affect the quality of service (as shown in Gaynor 
[1989]), the incentives for efficiency in production are unchanged by heterogeneity of 
the product. 
capital service hours used by $i$, $e_i$ is $i$'s effort, and $\theta_i$ is a vector of $i$’s 
characteristics that affect productive efficiency. The function $f$ is assumed 
to be strictly concave in all inputs. It is assumed that effort 
increases the marginal productivity of all factors of production. 

Given $h_i$, $t_i$, and $k_i$ chosen by the firm and given the fact that $\theta_i$ is 
exogenous, the partner's choice of effort determines the quantity 
produced. This choice maximizes his or her utility, and utility in turn 
depends directly on the net income the physician receives and in¬ 
versely on the level of effort and hours applied. The utility function is 
assumed to be linear in money and additively separable in money and 
effort or hours: 


$$u_i = y_i - v_i(e_i, h_i), \tag{2}$$

where $u_i$ is $i$’s utility, $y_i$ is $i$’s net income, and $v_i$ is the private nonmonetary 
cost of effort and hours; $v_i$ is assumed to be strictly convex in $e_i$ 
and $h_i$. The compensation structure determines $y_i$ for each partner 
and is described in stylized fashion by 

$$y_i = \alpha (P - C) q_i + \frac{1}{n} (1 - \alpha)(P - C) \sum_{j=1}^{n} q_j, \tag{3}$$


where $\alpha$ is the proportion of net income generated by $i$ that he 
“keeps,” $\alpha \in [0, 1]$; $P$ is the price of output; $C$ is the average cost of 
nonphysician inputs, assumed constant over output; and n is the 
number of partners in the firm. Thus the first term in (3) is the 
portion of net income generated by i that he keeps, and the second 
term is his share from the firm’s net income sharing pool. 6 

Maximization of utility yields the first-order condition, 


$$\left[\alpha + \frac{1}{n}(1 - \alpha)\right](P - C)\,\frac{\partial f(\cdot)}{\partial e_i} - \frac{\partial v_i(\cdot)}{\partial e_i} = 0. \tag{4}$$


The second-order condition (not shown) also holds, given the assumptions 
made about the functions $u_i$, $v_i$, and $f$. Together these 
define equilibrium effort as a function of the exogenous incentive 
parameters, $i$’s characteristics, and all other inputs (given that they are 
taken as fixed by $i$): 

$$e_i^* = g_i(\alpha, P, C, n, \theta_i, h_i, t_i, k_i). \tag{5}$$

Equation (4) can be readily interpreted as indicating that the utility- 
maximizing level of effort is the level at which the marginal net in¬ 
come product of effort (the first term in [4]) is equal to its marginal 


6 This form is highly simplified; real-world compensation structures often have dif¬ 
ferent shares for revenue and cost, and nonphysician average cost is not necessarily 
constant. 
utility cost (the second term in [4]). 7 Figure 1 illustrates this. Utility is 
maximized at the point (E) at which the marginal net income product 
of effort curve, II, and marginal utility cost of effort curve, CC, cross. 

Examining some comparative static derivatives for the equilibrium 
shows the effects of changes in $\alpha$, $P$, $C$, $n$, or $\theta_i$ on the optimal choice 
of $e_i$. Table 1 contains the results. These can also be determined by 
examination of the effects of any of these variables on curves II and 
CC in figure 1. 
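
As a rough numerical illustration of condition (4), the sketch below solves for effort under functional forms that are assumptions made here for illustration only and are not taken from the paper: $f = \theta\sqrt{e}$ with other inputs held fixed, and $v = c\,e^2/2$. The function names and parameter values are hypothetical. Under these assumed forms the signs of the effort responses to $\alpha$, $P$, $C$, and $n$ can be checked directly.

```python
import math


def sharing_weight(alpha, n):
    """Share of own marginal net revenue kept: alpha + (1 - alpha) / n."""
    return alpha + (1.0 - alpha) / n


def optimal_effort(alpha, P, C, n, theta=1.0, cost=1.0):
    """Solve the first-order condition (4) for effort by bisection.

    Assumed illustrative forms (not from the paper):
      f(e) = theta * sqrt(e)   so df/de = theta / (2 * sqrt(e))
      v(e) = cost * e**2 / 2   so dv/de = cost * e
    """
    s = sharing_weight(alpha, n)

    def foc(e):
        return s * (P - C) * theta / (2.0 * math.sqrt(e)) - cost * e

    lo, hi = 1e-12, 1e6  # foc is positive near zero and negative for large e
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if foc(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def closed_form_effort(alpha, P, C, n, theta=1.0, cost=1.0):
    """Closed form under the assumed forms: (s (P - C) theta / (2 cost))**(2/3)."""
    s = sharing_weight(alpha, n)
    return (s * (P - C) * theta / (2.0 * cost)) ** (2.0 / 3.0)


base = optimal_effort(alpha=0.5, P=20.0, C=8.0, n=5)
print("baseline effort:", base)
print("higher alpha raises effort:", optimal_effort(0.8, 20.0, 8.0, 5) > base)
print("higher price raises effort:", optimal_effort(0.5, 25.0, 8.0, 5) > base)
print("higher cost lowers effort:", optimal_effort(0.5, 20.0, 10.0, 5) < base)
print("larger group lowers effort:", optimal_effort(0.5, 20.0, 8.0, 10) < base)
```

Bisection is used only because it needs no derivative of the first-order condition; any root finder would serve, and the closed form is included as a cross-check on the assumed forms.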

It is clear from these results that a partner will be responsive to 
changes in the compensation structure and other variables set by the 
firm. The effect of an increase in $\alpha$ is to increase productivity, or 
measured efficiency, given measured levels of all inputs and given 
output price, because the increase will call forth higher levels of the 
unmeasured input effort, $e_i$. This can be seen by substituting the 
equilibrium effort function (5) into the production function (1): 

$$q_i = f(h_i, t_i, k_i, g_i(\alpha, P, C, n, \theta_i, h_i, t_i, k_i); \theta_i). \tag{1'}$$

Since the incentive parameters $\alpha$, $P$, $C$, and $n$ influence the amount of 
effort supplied, they thus affect the quantity produced. Equation (1') 
is a reduced form that represents behavior actually observed. Call this 
the “behavioral production function.” 8 Equation (1), in contrast, rep- 


7 We make the assumption here that there is enough demand at the price the group 
sets to allow the physician to satisfy the first-order condition. 

8 This is essentially a formal derivation of Jensen and Meckling's (1979) production 
function, which is defined conditional on property rights within the firm. McMillan, 
Whalley, and Zhu (1989) have also independently drawn this distinction between be¬ 
havioral and technical production functions. 
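TABLE 1

Variable      Comparative Static Derivative                                                                                                         Sign

$\alpha$      $-[1 - (1/n)](P - C)(\partial f/\partial e_i) \,/\, (\partial^2 u_i/\partial e_i^2)$                                                    +

$P$           $-\{\alpha + [(1/n)(1 - \alpha)]\}(\partial f/\partial e_i) \,/\, (\partial^2 u_i/\partial e_i^2)$                                      +

$C$           $\{\alpha + [(1/n)(1 - \alpha)]\}(\partial f/\partial e_i) \,/\, (\partial^2 u_i/\partial e_i^2)$                                       -

$n$           $[(1 - \alpha)/n^2](P - C)(\partial f/\partial e_i) \,/\, (\partial^2 u_i/\partial e_i^2)$                                              -

$\theta_i$    $-\{\alpha + [(1/n)(1 - \alpha)]\}(P - C)(\partial^2 f/\partial e_i \partial \theta_i) \,/\, (\partial^2 u_i/\partial e_i^2)$           +, -, 0*

* $\mathrm{sgn}[\partial e_i^*/\partial \theta_i] = \mathrm{sgn}[\partial^2 f/\partial e_i \partial \theta_i]$. If we think of $\theta_i$ as an individual's ability, then there is plausibly a greater return 
to additional effort associated with greater ability, which implies $\partial^2 f/\partial e_i \partial \theta_i > 0$ and thus $\partial e_i^*/\partial \theta_i > 0$.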


resents the "technical production function,” that is, the relationship 
between inputs and maximum outputs dictated by technology. 

Let $e_i^{\max}$ be the maximum value in $\{e_i\}$. Then call 

$$q_i^F = f(h_i, t_i, k_i, e_i^{\max}, \theta_i) \tag{6}$$

the “efficient” or “maximum (observed) effort” production function. 
This indicates the levels of output possible over the ranges of the 
other inputs, $h_i$, $t_i$, and $k_i$, given that $e_i = e_i^{\max}$. This is a frontier 
production function. 

We define a measure of technical efficiency as the ratio of observed 
output to the efficient output: 

$$R_i \equiv \frac{q_i}{q_i^F} = \frac{f(h_i, t_i, k_i, e_i, \theta_i)}{f(h_i, t_i, k_i, e_i^{\max}, \theta_i)}.$$

Since $f$ is monotonically increasing in $e_i$, $R_i \in [0, 1]$. 
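
The ratio can be illustrated with a few lines of code. The sketch below is a hypothetical example, not the paper's estimator: it assumes the technology $f = \theta\sqrt{e}$ with all other inputs held fixed, takes three made-up equilibrium effort levels, and measures each agent against the output attainable at the largest observed effort.

```python
import math


def output(effort, theta=1.0):
    """Assumed technology with other inputs held fixed: f = theta * sqrt(e)."""
    return theta * math.sqrt(effort)


def technical_efficiency(efforts, theta=1.0):
    """R_i = f(e_i) / f(e_max), where e_max is the largest effort in the sample."""
    e_max = max(efforts)
    frontier_output = output(e_max, theta)
    return [output(e, theta) / frontier_output for e in efforts]


efforts = [1.0, 2.25, 4.0]  # hypothetical equilibrium efforts of three agents
print(technical_efficiency(efforts))  # [0.5, 0.75, 1.0]
```

Because the assumed $f$ is increasing in effort, every ratio lies between zero and one, and only the agent supplying the maximum observed effort is measured as fully efficient.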

Two related points must be noted here. Measuring a firm as less 
than fully technically “efficient” ($R_i < 1$) does not imply that it is 
inefficient in a welfare sense. As indicated earlier, effort, $e_i$, is chosen 
to maximize utility, given the agent’s preferences and the compensation 
structure. If the compensation structure is optimal, this choice is 
Pareto efficient, by definition. Those agents who choose an $e_i < e_i^{\max}$ 
have made efficient choices, but ones that do not maximize output, 
given all other inputs. Therefore, the measure of technical efficiency 
measures the shortfall from a production frontier defined in terms of 
technically possible, but not necessarily Pareto-efficient, levels of output. 
Indeed, with subjective costs to effort, it will generally not be 
welfare optimal to be technically efficient. 

Since deviations of effort from the maximum are due in part to 
agents’ taste for effort, the technical efficiency measure captures these 
preferences. In fact, the distribution of the $R_i$ over the agents reflects 
the distribution of the agents’ tastes for effort. Consequently, we can 
think of this measure as measuring agents’ preferences for effort, 
other things (including incentives) held constant. Those with a 
greater preference for effort will be measured as more technically 
efficient, and vice versa. 

Effort cannot be observed directly but can be measured empirically 
by estimating the production function as a frontier production function 
and then calculating a measure of technical efficiency that corresponds 
to $R_i$. Since the $e_i^{\max}$ are not observed, we measure effort by 
defining the efficient production function (6) for the sample and 
comparing individual observations to this frontier. 

B. Endogenous Incentives 

In this paper we do more than simply estimate a frontier function. We 
take account of the endogeneity of the incentive devices that are 
hypothesized to affect effort. We therefore analyze measured pro¬ 
ductivity by using both endogenous explanatory variables and a fron¬ 
tier function technique. Correct treatment of the error term in a re¬ 
gression equation requires that these approaches be combined. 

To see why this is so, note that a conventional ordinary least squares 
(OLS) regression line can suffer from simultaneous equations bias. 
This bias will arise if agents differ in their responsiveness to incentives 
and they can choose the form and level of incentives they face. 

Suppose that all agents have equal ability in the sense that, with a 
given level of effort and with given levels of all other inputs, equal 
outputs will be observed. However, since the level of effort is not 
equal when compensation structures differ and is not directly observ¬ 
able, measured productivity can still differ. Suppose also that agents 
differ across groups in their responsiveness to financial incentives, 
that is, in their willingness to trade off effort for financial reward. 
When faced with payments that fully correspond to the revenue for 
their services ($\alpha = 1$), all agents in all groups are equally productive; 
as $\alpha$ falls below unity, productivity falls, but at different rates for 
different groups of agents. Figure 2 shows several possible “incentive- 
productivity” curves; the dashed line plots the curve for an individual 
of average responsiveness. 9 


9 The incentive-productivity curves in fig. 2 illustrate one possible configuration. 
They do not have to cross at $\alpha = 1$, nor must they cross at all. 
Fig. 2. Incentive-productivity curves for agents of differing responsiveness 


Since reducing $\alpha$ below unity reduces productivity, why would any 
firm choose a value less than one? One may conjecture that, for risk- 
averse individuals facing a situation in which actual productivity is 
affected by random events, some implicit insurance may be chosen. 
Having sicker patients show up at one’s office may reduce produc¬ 
tivity-based income, but receiving part of one's compensation as a sal¬ 
ary independent of productivity guards against the risk (see Gaynor 
[1983] for a proof). 

If individuals have similar attitudes toward risk, one would then 
expect the level of $\alpha$ chosen by the members of a partnership to be 
inversely related to their common degree of responsiveness to financial 
incentives. Actual observations might cluster as the x’s in 
figure 2, and the estimated effect of $\alpha$ would be indicated by the slope of the 
line EE′, which does not even have the same sign as any of the true 
values of the marginal product of effort. All that is needed for this 
sort of bias to occur is variation across individuals in their elasticity of 
supply of effort in response to incentives. 

The solution to this problem is to find identifying variables and to 
treat $\alpha$ as endogenous. Our data set does contain a number of such 
variables. 
III. Empirical Methodology 

The literature on production frontiers and comparative efficiency 
(see Aigner, Lovell, and Schmidt 1977; Meeusen and van den Broeck 
1977; Førsund, Lovell, and Schmidt 1980; Greene 1980) indicates 
that technical efficiency can be estimated via econometric techniques 
as a means of comparing actual output to the output that would result 
from a “best practice” frontier that corresponds to the most efficient 
set of observations. Let 


$$q_i = f(X_i)\, u_i v_i, \tag{6'}$$

where $X_i$ is input, $v_i$ is a random shock, and $u_i$ is a multiplicative error 
term representing efficiency, that is, 

$$u_i = \frac{q_i}{f(X_i)\, v_i}, \qquad u_i \in [0, 1].$$

Since the production function $f$ represents the frontier along which 
decisions are efficient in the conventional treatment, the error component 
$u_i$ must be constrained to be nonnegative. 

Consider a basic production model 

$$q_i = \sum_j \beta_j X_{ij} + \epsilon_i, \tag{7}$$

where the $X_{ij}$ represent the $j$ inputs, and the $\epsilon_i$ are error terms. The 
stochastic frontier model begins by assuming that the error term is 
decomposed into two independent components, 

$$\epsilon_i = u_i + v_i, \qquad u_i \le 0. \tag{8}$$

The error components $u_i$ are one-sided, nonpositive errors derived 
from a normal distribution with mean zero and variance $\sigma_u^2$, which is 
truncated from above at zero. This one-sided error component forces 
output to lie on or below the production frontier. Thus any deviations 
from technical efficiency will be captured by the $u_i$. 

The $v_i$ are two-sided errors representing statistical noise, and 
$v_i \sim N(0, \sigma_v^2)$. The $u_i$ and $v_i$ are assumed to be independent. 

Using Weinstein’s (1964) derivation of the distribution function 
of the sum of a symmetric and a truncated normal random variable, 
Aigner et al. derive the log likelihood function that can be used to 
estimate the parameters of interest. For $n$ observations, it can be written 
as 

$$\ln L(q \mid \beta, \lambda, \sigma^2) = n \ln \frac{\sqrt{2}}{\sqrt{\pi}} + n \ln \sigma^{-1} + \sum_{i=1}^{n} \ln\left[1 - \Phi(\epsilon_i \lambda \sigma^{-1})\right] - \frac{1}{2\sigma^2} \sum_{i=1}^{n} \epsilon_i^2, \tag{9}$$

where $\sigma^2 = \sigma_u^2 + \sigma_v^2$, $\lambda = \sigma_u/\sigma_v$, and $\Phi$ represents the standard normal 
distribution function. The maximum likelihood estimates are consistent 
and asymptotically efficient. 
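
To make the likelihood concrete, the following is a minimal sketch of the log likelihood for composed errors $\epsilon_i = u_i + v_i$ with $u_i \le 0$ half-normal and $v_i$ normal, in the parameterization $\lambda = \sigma_u/\sigma_v$, $\sigma^2 = \sigma_u^2 + \sigma_v^2$. It takes the residuals as given rather than estimating the production function, the function names are hypothetical, and it is not the authors' code. The upper tail $1 - \Phi(x)$ is computed with `math.erfc` for numerical stability.

```python
import math


def upper_tail(x):
    """1 - Phi(x) for the standard normal, computed stably via erfc."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def frontier_loglik(residuals, lam, sigma2):
    """Log likelihood of composed errors eps_i = u_i + v_i with u_i <= 0.

    lam    = sigma_u / sigma_v
    sigma2 = sigma_u**2 + sigma_v**2
    """
    n = len(residuals)
    sigma = math.sqrt(sigma2)
    total = n * math.log(math.sqrt(2.0) / math.sqrt(math.pi)) - n * math.log(sigma)
    for eps in residuals:
        total += math.log(upper_tail(eps * lam / sigma))
        total -= eps * eps / (2.0 * sigma2)
    return total


residuals = [-0.5, 0.2, -1.0]  # made-up residuals for illustration
print(frontier_loglik(residuals, lam=1.5, sigma2=1.0))
print(frontier_loglik(residuals, lam=0.0, sigma2=0.8))  # lam = 0: ordinary normal case
```

Setting $\lambda = 0$ removes the one-sided component, and the expression collapses to the usual normal log likelihood, which gives a convenient sanity check. In practice the residuals would depend on the production function parameters, and the whole expression would be maximized jointly over them, $\lambda$, and $\sigma^2$.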

We estimate the medical practice production function in a transcendental 
form, as suggested by Reinhardt (1972): 

$$q_i = A \prod_j X_{ij}^{\beta_j} \exp\left[\sum_k \gamma_k Y_{ki} + \sum_k \delta_k (Y_{ki})^2\right] \exp(\epsilon_i), \tag{10}$$

where the $X_{ij}$ and $Y_{ki}$ are inputs. This is a special case of the generalized 
power production function as described by Reinhardt and by 
de Janvry (1972). It is a flexible functional form, with the desirable 
properties that positive output can be achieved without some of the 
inputs (the $Y$'s) and that the elasticity of scale is variable, as are the 
elasticities of substitution. 
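
The two properties just mentioned can be checked numerically. The sketch below evaluates a transcendental function of this type and its elasticity of scale, which equals $\sum_j \beta_j + \sum_k (\gamma_k + 2\delta_k Y_k) Y_k$ and therefore varies with the level of the $Y$ inputs. All parameter values and input levels are hypothetical and chosen only for illustration; they are not estimates from the paper.

```python
import math


def transcendental_output(A, X, beta, Y, gamma, delta, eps=0.0):
    """q = A * prod_j X_j**beta_j * exp(sum_k (gamma_k Y_k + delta_k Y_k**2)) * exp(eps)."""
    power_part = math.prod(x ** b for x, b in zip(X, beta))
    exponent = sum(g * y + d * y * y for y, g, d in zip(Y, gamma, delta))
    return A * power_part * math.exp(exponent + eps)


def scale_elasticity(X, beta, Y, gamma, delta):
    """Sum of output elasticities: beta_j for each X_j and
    (gamma_k + 2 * delta_k * Y_k) * Y_k for each Y_k."""
    return sum(beta) + sum((g + 2.0 * d * y) * y for y, g, d in zip(Y, gamma, delta))


# Hypothetical values: one X input (physician hours), one Y input (aide hours).
X, beta = [30.0], [0.6]
gamma, delta = [0.01], [-0.00005]

print(transcendental_output(2.0, X, beta, [0.0], gamma, delta))   # positive with Y = 0
print(scale_elasticity(X, beta, [40.0], gamma, delta))            # about 0.84
print(scale_elasticity(X, beta, [80.0], gamma, delta))            # about 0.76
```

With a negative $\delta_k$, as assumed here, the scale elasticity declines as the $Y$ input grows, which is the sense in which returns to scale are variable rather than fixed by the functional form.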

Once the stochastic frontier production function has been estimated, 
a measure of “technical efficiency” corresponding to $R_i$ can be 
calculated. Afriat (1972) has suggested using $\exp(u_i)$ as the measure 
of technical efficiency. In terms of the production function (10), this is 

$$\exp(u_i) = \frac{q_i}{A \prod_j X_{ij}^{\beta_j} \exp\left[\sum_k \gamma_k Y_{ki} + \sum_k \delta_k (Y_{ki})^2\right] \exp(v_i)}, \tag{11}$$

that is, the ratio of observed output to the stochastic frontier. The 
term $\exp(u_i)$ takes on values in the inclusive range zero to one. The $v_i$, 
however, are unobservable, so this measure cannot be computed for 
each observation. The mean technical efficiency, however, can be 
computed. Lee and Tyler (1978) show that when $u$ has a truncated 
normal distribution (i.e., $e^u$ has a truncated lognormal distribution), 

$$E[\exp(u)] = 2 \exp\left(\frac{\sigma_u^2}{2}\right)[1 - \Phi(\sigma_u)], \tag{12}$$

where $\Phi$ is the standard normal distribution function. This is a measure 
of the shape of the distribution of the $u_i$. The term $\sigma_u$ can be 
calculated from the estimated parameters $\sigma^2$ and $\lambda$. Aigner et al. 
suggest using $\lambda = \sigma_u/\sigma_v$ and $E(u) = E(\epsilon) = -(\sqrt{2}/\sqrt{\pi})\sigma_u$ as measures 
of average inefficiency; consequently we calculate these as well. 
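
These summary measures are simple to compute once $\sigma^2$ and $\lambda$ are in hand. The sketch below first recovers $\sigma_u$ from the identities $\sigma^2 = \sigma_u^2 + \sigma_v^2$ and $\lambda = \sigma_u/\sigma_v$, which give $\sigma_u^2 = \sigma^2\lambda^2/(1 + \lambda^2)$, and then evaluates the mean technical efficiency and $E(u)$. The function names and the input values are hypothetical, not estimates from the paper.

```python
import math


def sigma_u_from(sigma2, lam):
    """Recover sigma_u from sigma^2 = sigma_u^2 + sigma_v^2 and lam = sigma_u / sigma_v."""
    return math.sqrt(sigma2 * lam * lam / (1.0 + lam * lam))


def mean_technical_efficiency(sigma_u):
    """E[exp(u)] = 2 exp(sigma_u^2 / 2) [1 - Phi(sigma_u)] for half-normal u <= 0."""
    upper_tail = 0.5 * math.erfc(sigma_u / math.sqrt(2.0))  # 1 - Phi(sigma_u)
    return 2.0 * math.exp(0.5 * sigma_u ** 2) * upper_tail


def mean_inefficiency(sigma_u):
    """E(u) = -(sqrt(2) / sqrt(pi)) * sigma_u."""
    return -math.sqrt(2.0 / math.pi) * sigma_u


s_u = sigma_u_from(sigma2=1.25, lam=2.0)  # hypothetical estimates; gives sigma_u = 1
print(s_u)
print(mean_technical_efficiency(s_u))     # roughly 0.52
print(mean_inefficiency(s_u))             # roughly -0.80
```

As $\sigma_u$ approaches zero the one-sided error vanishes and mean technical efficiency approaches one, so larger values of $\sigma_u$ (or of $\lambda$ for given $\sigma^2$) correspond to a sample that lies further below its frontier on average.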

As indicated earlier, the compensation structure of the firm will 
affect production, and the compensation structure itself is related to 
agents’ preferences for leisure. Thus the variables that measure the 
compensation structure of the firm are endogenous. Our approach to 
this problem is to generate fitted values for the endogenous compen¬ 
sation structure measures from first-stage equations and use the fitted 
values as instruments in the production function (adjusting the esti¬ 
mates of the standard errors). 
The data utilized for this study were assembled by Mathematica Policy 
Research, under contract to the National Center for Health Services 
Research, U.S. Department of Health and Human Services. The bulk 
of the data set is composed of surveys conducted by Mathematica, 
although some secondary data sources have been incorporated. Dur¬ 
ing the period March-June of 1978, Mathematica conducted a 
nationwide survey of medical group practices. The final sample in¬ 
cluded 957 groups and 6,353 physicians practicing in those groups. 
The sample was stratified by group size, type of group (multispecialty 
or single specialty), physician specialty, and prepaid versus fee for 
service. Large group practices were oversampled in an effort to sup¬ 
ply a reasonable number of observations, and a census was taken of 
prepaid groups for the same purpose. Further, only five medical 
practice specialties were sampled: general practice, internal medi¬ 
cine, pediatrics, general surgery, and obstetrics/gynecology. Approxi¬ 
mately 60 percent of all office-based physicians practice in these spe¬ 
cialties. 

This data set also includes data measuring characteristics of the 
area in which the group practiced and data on the hospital with which 
the group is affiliated. The data on area characteristics were obtained 
from many sources, including the American Medical Association, the 
County and City Data Book, and various other sources. For a complete 
description of all these data sources, see Boldin et al. (1979). The 
hospital data were obtained from the American Hospital Association 
guide for 1978. 


V. Estimation and Results 

In this section we first discuss the estimation of traditional versus 
behavioral production functions and describe the relationship be¬ 
tween the theoretical variables and those used in the empirical im¬ 
plementation. We then discuss the estimation strategy and report the 
results. Table Al in the Appendix presents the names and definitions 
for the variables employed in estimating the production frontier. 
Table A2 presents the means and variances of those variables. 

A. Traditional and Behavioral Production Functions 

We estimated two versions of the production function: the “traditional” 
production function and the “behavioral” production function 
we have described. The traditional production function describes the 
purely technical relationship between physical inputs and physical 
output. The inputs are physician hours, $h_i$; aide time, $t_i$; and capital, $k_i$. 
The “behavioral” production function represents the behavioral relationship 
between inputs and output as determined by the internal 
organization of the firm. This corresponds to equation (1'): 

$$q_i = f(h_i, t_i, k_i, g_i(\alpha, P, C, n, \theta_i, h_i, t_i, k_i); \theta_i). \tag{1'}$$

Consequently, incentive variables enter as well as physical inputs. 
Since incentives determine the behavior of individual agents and pro¬ 
duction is an outcome of an agent’s actions, clearly the production 
process we observe is a behavioral production process and, as such, is 
characterized by the production relation in equation (1'). 

Since (1') characterizes the nature of observed production, estimates 
of the traditional production function, $q_i = f(h_i, t_i, k_i)$, will not in 
general provide us with accurate measures of marginal products or 
returns to scale. The reason is that omitting unmeasured effort may 
bias the estimates. In addition, the traditional specification neglects 
the effects of (correlated) incentives on behavior. 10 We therefore estimated 
both versions and present the results for purposes of comparison, 
to indicate the sensitivity of such estimates to the new method we 
have outlined. 

B. Variables 

The variables included in the “traditional” specification correspond to 
the variables $h_i$, $t_i$, and $k_i$. The measure of output is the number of 
office visits per week for primary-care physicians, for whom office 
visits are a large proportion of total practice. Physician hours in the 
office per week correspond to $h_i$. The aide time variable is represented 
by the number of hours of nonphysician medical personnel 
per week, administrative personnel per week, and the square of the 
sum of the two measures. The number of examining rooms per physi¬ 
cian in the group is a proxy for capital. Also included are dummy 
variables for physician specialty and the presence of a graduate physician 
assistant. The “behavioral” specification of the production function 
includes the variables in the “traditional” specification, with addition 
of the empirical equivalents of the incentive variables $\alpha$, $P$, $C$, 
and $n$ and some additional variables hypothesized to affect productivity 
($\theta_i$). 

We entered the incentive variables in two different forms: separately 
and jointly. The separate incentive variables are the compensation 
scale, average price for an office visit in the area, wage rate for a 
registered nurse, and the number of full-time-equivalent physicians 
in the group. The compensation scale variable varies between one and 
10, where a value of one indicates that the physician regarded his 
compensation as completely unrelated to productivity and a value of 
10 indicates a perfect relationship. 11 This corresponds to the theoretical 
variable $\alpha$. The average price in the area for an office visit is the 
measure of $P$. We use this measure rather than the price the physician 
reported as charging for an office visit because of concern over 
multicollinearity. 12 The average wage paid a registered nurse in the 
group is a proxy for the average cost variable specified in the theory, 
$C$. Although a measure of cost is available, in practice it is plagued by 
missing values. The number of full-time-equivalent physicians in the 
group corresponds to the theoretical variable for group size, $n$. 

10 In effect the traditional specification amounts to restricting the coefficients on the 
incentive variables to be zero. The results from the behavioral specification clearly 
reject this hypothesis. 

In addition to entering the incentive variables separately, we con¬ 
structed a measure corresponding to the joint incentives presented by 
these variables: 


$$\left[\alpha + \frac{1}{n}(1 - \alpha)\right](P - C).$$

We call this variable the joint incentive variable. In order to create this 
variable, the compensation scale variable was divided by 10 to make it 
vary between zero and one, and a proxy for average cost was con¬ 
structed. The average nurses’ wage was multiplied by the number of 
nurse hours used by the physician and then divided by the number of 
office visits. This provides a measure of average labor costs to the 
physician. Since the production of office visits is labor, rather than 
capital, intensive, this is a reasonable proxy for average cost. 
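
The construction just described can be written out in a few lines. The sketch below follows the steps in the text (rescale the compensation scale by 10, build average labor cost per office visit from the nurse wage and nurse hours, then form the joint term); the function name, argument names, and the example values are hypothetical and are not drawn from the data set.

```python
def joint_incentive(scale, area_price, nurse_wage, nurse_hours, office_visits, n_physicians):
    """Joint incentive term [alpha + (1 - alpha) / n] * (P - C).

    scale         : compensation scale on the 1-to-10 survey range
    area_price    : average area price for an office visit (P)
    nurse_wage    : average hourly wage of a registered nurse
    nurse_hours   : nurse hours used by the physician per week
    office_visits : the physician's office visits per week
    n_physicians  : full-time-equivalent physicians in the group (n)
    """
    alpha = scale / 10.0                               # rescaled compensation scale
    avg_cost = nurse_wage * nurse_hours / office_visits  # labor cost per visit (C)
    weight = alpha + (1.0 - alpha) / n_physicians
    return weight * (area_price - avg_cost)


# Hypothetical physician: scale 7, $15 visit price, $6 wage, 40 nurse hours,
# 120 visits per week, in a four-physician group.
print(joint_incentive(7, 15.0, 6.0, 40.0, 120.0, 4))  # approximately 10.075
```

Because office visits appear in the denominator of the cost proxy, the constructed variable depends on the output measure itself, which is one reason to treat it as potentially endogenous.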


1. Endogenous Variables 

A number of the independent variables in the production function 
are potentially endogenous. Specifically, we think that the compensa¬ 
tion scale, group size, the joint incentive variable, the group’s being 50 
percent prepaid, and physician hours may not be exogenous. The 


11 The compensation scale is highly correlated with other measures of the compensa¬ 
tion system. The simple correlation between the compensation scale and the percentage 
of compensation that is based on productivity is .91. The correlation between the 
compensation scale and the change in net income per $1,000 of patient billings is .96. 

12 The fee charged by the physician for an office visit is significantly correlated at 
better than a 1 percent confidence level with the compensation scale, the wage rate, and 
group size. When this variable was used in the regression, its coefficient was either 
negative, insignificant, or both, although a specification test did not reject exogeneity. 
Consequently, we used the area price as a proxy. It is highly correlated with the individual 
physician’s fee, but not with the other variables. 
physician may have an effect on the incentive and prepaid variables in 
two ways: his role in decision making within the group and his choice 
of which group to join. In addition, if the physician has any discretion 
in setting his own hours, physician hours will be correlated with 
omitted physician preferences. We employed the exogeneity test 
of Durbin (1954), Wu (1973), and Hausman (1978) 13 to examine 
whether these variables can be treated as uncorrelated with the error 
term in the regression. As a result of these tests, the exogeneity of the 
compensation scale, the number of physicians in the group, and the 
joint incentive variable are rejected at the 1 percent, 5 percent, and 1 
percent levels, respectively. The exogeneity of the group’s being 50 
percent or more prepaid and of physician hours cannot be rejected. 

Since the compensation scale, number of physicians in the group, 
and the joint incentive variable are probably not exogenous, we 
created instruments for these variables. The instruments are fitted 
values from first-stage regressions that include all the explanatory 
variables in the production function plus a vector of physician tastes 
and characteristics that serve as identifying variables. 
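
The logic of replacing an endogenous regressor with first-stage fitted values can be illustrated on simulated data. The sketch below is a deliberately stripped-down, hypothetical example with one endogenous regressor and one identifying variable; the data-generating process, the coefficient values, and all names are assumptions made here, and no standard errors are computed. An unobserved taste variable drives both the incentive measure and output, so ordinary least squares is biased, while the two-stage estimate recovers the assumed effect.

```python
import random


def ols_slope_intercept(x, y):
    """Simple bivariate least squares: returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def two_stage_least_squares(z, x, y):
    """First stage: regress x on z. Second stage: regress y on the fitted x."""
    b1, a1 = ols_slope_intercept(z, x)
    x_hat = [a1 + b1 * zi for zi in z]
    slope, _ = ols_slope_intercept(x_hat, y)
    return slope


def simulate(n=20000, true_effect=1.0, seed=0):
    """Hypothetical data: z is an identifying variable, taste is unobserved."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0.0, 1.0)
        taste = rng.gauss(0.0, 1.0)          # affects both the incentive and output
        xi = zi + taste                      # endogenous incentive measure
        yi = true_effect * xi - 2.0 * taste + rng.gauss(0.0, 1.0)
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y


z, x, y = simulate()
ols_estimate, _ = ols_slope_intercept(x, y)
print("OLS estimate: ", ols_estimate)                       # biased toward zero here
print("2SLS estimate:", two_stage_least_squares(z, x, y))   # close to the assumed effect of 1
```

In this assumed design the probability limit of the least-squares slope is zero even though the true effect is one, which shows how severe the bias can be when the incentive measure is correlated with omitted tastes. A full application would include the other explanatory variables in both stages and adjust the second-stage standard errors.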

C. Estimation 

The traditional and behavioral frontier production functions were 
estimated using maximum likelihood, using instrumental variables to 
correct for endogeneity. We also estimated traditional and behavioral 
production functions using two-stage least squares (2SLS). The max¬ 
imum likelihood estimates can be compared to the least-squares esti¬ 
mates as estimates of a frontier, as opposed to an average, production 
function, a la Lee and Tyler (1978). The other comparisons we make 
are of the parameter estimates, marginal products, and elasticities of 
scale from the “traditional” versus the behavioral production func¬ 
tion. The traditional and behavioral production functions were also 
estimated by OLS for the purpose of comparison with the instrumen¬ 
tal variables estimates. 


D. Results 

Table 2 shows the maximum likelihood and 2SLS estimates of the 
production function; the first-stage estimates are presented in table 3. 
The OLS estimates are presented in table A3 in the Appendix. The 
signs of the coefficients in all regressions are generally as expected. 
One exception is the number of examining rooms, which is insignificant 
in all but one regression. It may be that the number of examining 
rooms per physician functions as a poor proxy for the flow of capital 
services. The other inputs (physician time, aide time, and administrators’ 
time) all have positive and significant coefficients. Experience 
has a positive but diminishing effect. This is consistent with a 
hypothesis that greater experience leads to greater productivity but is 
counteracted by increasing age. 

13 See Nakamura and Nakamura (1981) for an exposition of the equivalence of these 
tests. 

Presence of graduate physician assistant    -.95***    5.18***    -1.13*** 
                                            (.25)      (1.08)     (.355) 
Group is more than 50 percent prepaid       -1.193*    -2.69      .035 
                                            (.62)      (2.70)     (.882) 
Lack of importance of: 

The coefficients are very similar across the specifications. The 
coefficients from the average production function (2SLS) do not ap¬ 
pear to be significantly different from those of the stochastic frontier 
(maximum likelihood) estimates. The intercept term from the fron¬ 
tier, however, is slightly larger than the intercept from the average 
function, as it should be, indicating a greater level of technical 
efficiency. Thus it appears that more efficient firms achieve their 
efficiency by a “neutral” application of effort, which does not affect 
the shape of the production function, but only shifts it. 14 

The estimates for the parameters common to both the traditional 
production function (col. 3 in table 2 and col. 1 in table A3) and the 
behavioral production function (cols. 1, 2, 4, and 5 in table 2) are very 
similar. The only difference is that the capital proxy, the number of 
examining rooms per physician, is significant only in the traditional 
specification. This similarity does not, however, imply that the charac¬ 
teristics of production, such as returns to scale or marginal products, 
are the same for the traditional and the behavioral specifications of 
the production function. In fact, they are different, as we indicate 
later. 

The estimates of the parameters of the behavioral production func¬ 
tion utilizing the incentive variables separately and the joint incentive 
variable are also very close. Overall, the results appear to be quite 
robust to choice of variables and estimation technique. Consequently, 
we shall discuss the behavioral aspects of our results by using the 
maximum likelihood estimates of the behavioral frontier production 
function with the incentive variables entered separately since this al¬ 
lows us to analyze their individual effects. These results are contained 
in column 4 of table 2. 

The coefficient for the compensation scale variable is positive and 
significant, as hypothesized. An increasingly strong link between com¬ 
pensation and productivity does lead to the production of more office 
visits per week. This is consistent with the finding of Reinhardt et al. 

14 Since both least-squares and maximum likelihood frontier estimates of the slope 
parameters are consistent, but least squares is less efficient, we do not expect these 
estimates to be significantly different. On the other hand, the least-squares estimate of 
the constant term is biased and inconsistent, so the least-squares and maximum likeli¬ 
hood estimates of this parameter should differ, as they do. 
(1979). Specifically, as the value of the compensation scale moves
from its minimum at one to its maximum at 10, output will increase by 
28 percent. This is as hypothesized for a firm with nonjoint produc¬ 
tion, where output can be observed but effort cannot. 

The effect of price on output is positive, as hypothesized, but not 
statistically significant. The lack of significance is probably due to the 
fact that the price variable is an area average rather than being 
specific to the group. 15 The coefficient for the wage rate is negative
and significant. A one-dollar increase in the wage rate will decrease 
output by 1.5 percent. This is consistent with the hypothesis that 
higher costs decrease output, where the wage rate is a proxy for 
average cost. The number of physicians in the group has a large 
negative and strongly significant effect. A 10 percent increase in 
group size decreases output by over 1.7 percent. This is consistent 
with the hypothesis that, ceteris paribus, incentives are diminished 
with increased group size. 
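
The magnitudes quoted above can be read off a log-linear specification. The sketch below is only an illustration, assuming that the dependent variable is ln(office visits), that the compensation scale and wage enter in levels, and that group size enters in logs (as the variable definitions in table A1 suggest). The coefficient values are back-solved or hypothetical, chosen to match the percentages in the text; they are not the estimates reported in table 2, and the function names are introduced here for illustration.

```python
import math


def pct_change_semilog(coef, delta_x):
    """Percent change in output when a level regressor x moves by delta_x
    in a model of the form ln(Q) = ... + coef * x."""
    return 100.0 * (math.exp(coef * delta_x) - 1.0)


def pct_change_loglog(elasticity, pct_change_x):
    """Percent change in output when a logged regressor rises by pct_change_x
    percent in a model of the form ln(Q) = ... + elasticity * ln(x)."""
    return 100.0 * ((1.0 + pct_change_x / 100.0) ** elasticity - 1.0)


def implied_semilog_coef(pct_change, delta_x):
    """Coefficient that would generate a given percent change over delta_x."""
    return math.log(1.0 + pct_change / 100.0) / delta_x


# Back-solved, illustrative coefficient (NOT an estimate from the paper):
# a 28 percent rise in output as the compensation scale moves from 1 to 10.
scale_coef = implied_semilog_coef(28.0, 10 - 1)
print("implied scale coefficient:", round(scale_coef, 4))
print("check, percent change over 9 steps:",
      round(pct_change_semilog(scale_coef, 9), 1))

# Hypothetical wage coefficient of -0.015 (a one-dollar rise in the wage).
print("wage effect, percent:", round(pct_change_semilog(-0.015, 1.0), 2))

# Hypothetical group-size elasticity of -0.18 (a 10 percent rise in size).
print("group size effect, percent:", round(pct_change_loglog(-0.18, 10.0), 2))
```

Under these assumptions the exact exponential formula and the usual "coefficient times 100" approximation differ only slightly for small coefficients, which is why a wage coefficient near -0.015 reads as roughly a 1.5 percent decline.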

The joint incentive variable is positive and strongly significant, indi¬ 
cating that agents respond to an overall increase in incentives by 
increasing production. Whether or not the group is multispecialty or 
largely prepaid seems to have little effect on output. 

The first-stage estimations presented in table 3 also contain some 
interesting results. The physician preference variables tend to have 
the signs that intuition would suggest. We expect that physicians with 
a preference for hard work or productivity-related rewards should 
locate in groups in which compensation is strongly related to produc¬ 
tivity, or lead groups to adopt a high value of the compensation scale. 
That is what the empirical results indicate. The lack of alleged impor¬ 
tance of regular income is positively related to the compensation scale 
and to the joint incentive variable, indicating that the less important 
regular income is to a physician, the more strongly related to produc¬ 
tivity his group’s compensation structure is. This may measure the 
impact of risk aversion on compensation method. The lack of alleged 
importance of productivity related to income is negatively related to 
the compensation scale and the joint incentive variable. (Here “impor¬ 
tance” is interpreted as the physician’s subjective feelings about the 
propriety of relating income to productivity.) This indicates that the
more important a physician deems relating productivity to income, 
the stronger will be the relation between compensation and productivity.
As preferred group size rises, so does group size. The larger the group
size deemed best for protection from financial loss, the larger the
physician’s group is. This is the only one of the group size preference
variables that is statistically significant. The physician’s
self-reported responsiveness to financial incentives had no statistically
significant effect on any of the dependent variables. Board certification
status, thought to be a proxy for physician ability, was also uniformly
insignificant.

TABLE 4

Maximum Likelihood Estimates of Efficiency Statistics

                                   Specification

                  “Traditional”    “Behavioral,”          “Behavioral,”
                                   Incentive Variables    Joint Incentive
                                   Separately             Variable
                  (1)              (2)                    (3)

$\lambda$         1.267            1.269                  1.219
$\sigma_u^2$      .364             .357                   .345
$\sigma_v^2$      .227             .222                   .232
$1 - E(u)$        .519             .523                   .532
$E(e^{-u})$       .656             .658                   .662

Source: Col. 1 = col. 3 of table 2; col. 2 = col. 4 of table 2; col. 3 = col. 5 of table 2.

The estimates of technical efficiency and related parameters are 
presented in table 4. The average level of technical efficiency, $E(e^{-u})$,
ranges from 0.656 to 0.662. This may appear to be quite low, indicating
considerable inefficiency, but it is no lower than estimates from
some other studies of firms in other industries. 16 Interestingly
enough, the measure of average technical efficiency is insensitive to 
the specification. It takes the value 0.656 when the traditional produc¬ 
tion function is estimated and the value 0.658 when the behavioral 
production function is estimated with incentive variables added sepa¬ 
rately. 
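
As a rough check on how the entries of table 4 relate to one another, the sketch below recomputes the summary statistics from the two variance parameters. It assumes a composed error with a half-normal inefficiency term u (scale parameter sigma_u) and a normal noise term v, and it reads the second and third rows of table 4 as the variance parameters sigma_u^2 and sigma_v^2. Both are assumptions made here for illustration; the function names are not from the paper.

```python
import math


def std_normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def efficiency_statistics(sigma_u_sq, sigma_v_sq):
    """Summary statistics for a half-normal/normal frontier (assumed form).

    lambda            = sigma_u / sigma_v
    E(u)              = sigma_u * sqrt(2 / pi)
    E(exp(-u))        = 2 * exp(sigma_u^2 / 2) * (1 - Phi(sigma_u))
    """
    sigma_u = math.sqrt(sigma_u_sq)
    lam = math.sqrt(sigma_u_sq / sigma_v_sq)
    mean_u = sigma_u * math.sqrt(2.0 / math.pi)
    mean_efficiency = (2.0 * math.exp(sigma_u_sq / 2.0)
                       * (1.0 - std_normal_cdf(sigma_u)))
    return {
        "lambda": lam,
        "one_minus_mean_u": 1.0 - mean_u,
        "mean_efficiency": mean_efficiency,
    }


# Variance parameters as read from table 4, columns 1 to 3.
table4_inputs = {
    "traditional": (0.364, 0.227),
    "behavioral_separate": (0.357, 0.222),
    "behavioral_joint": (0.345, 0.232),
}

for name, (su2, sv2) in table4_inputs.items():
    stats = efficiency_statistics(su2, sv2)
    print(name, {k: round(v, 3) for k, v in stats.items()})
```

Because the inputs are themselves rounded to three decimals, the recomputed values should agree with the tabulated ones only up to small rounding differences in the third decimal place.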

If observed differences in efficiency are due to differences in the 
taste for effort, as we hypothesize, then this similarity is as expected. 
The reason for this is that the distribution of tastes is what leads to 
the distribution of observed technical efficiency, and this will be 
unchanged by incentives. 

Calculations of the marginal products of the “physical” inputs from 


16 For example, Greene (1980) estimates technical efficiency for the U.S. primary
metals industry as 0.6454; an estimate of technical efficiency equal to 0.546 for U.S.
electric utilities can be calculated from Stevenson's (1980) paper. Lee and Tyler estimate
technical efficiency for Brazilian manufacturing firms as 0.625, and Pitt and Lee
(1981) estimate efficiency in Indonesian weaving as 0.618. The congruency of these
measures with ours suggests that the dispersion of the taste for effort among physicians
may be no different from that in some other industries.
both the traditional and behavioral production functions and of elasticities
of scale reveal that there are differences between the measures
calculated from the traditional and from the behavioral production 
functions. The behavioral measures of marginal products and elastic¬ 
ity of scale increase markedly as the compensation scale increases. 
The complete calculations are contained in an appendix available 
from the authors. 


VI. Summary and Conclusions 

The goal of this paper was to investigate the determinants of produc¬ 
tive efficiency in partnerships. In order to examine this matter, we 
derived a behavioral production function from individuals’ responses 
to incentives and related it to the frontier production function from 
the econometric literature. This provides a theoretical basis for this 
econometric technique. We have also demonstrated that it is not pos¬ 
sible in general to recover the parameters of the underlying produc¬ 
tion technology when production decisions are determined through 
behavioral responses to incentives. 

Empirically, we find that incentives affect the quantity produced, 
but not measured technical efficiency, because increased incentives 
call forth a greater supply of effort from all agents but do not change 
agents’ tastes for effort. Specifically, relating compensation to produc¬ 
tivity does increase production, as theory would suggest. The number 
of members in a group decreases the quantity produced, and experi¬ 
ence leads to greater productivity. 

Overall, the empirical results are highly consistent with theoretical 
work on the internal theory of the firm, which predicts that produc¬ 
tivity-based compensation schemes will work well for firms with non¬ 
joint production and observable output. These two criteria are met by 
medical group practices. Further research on the relationship be¬ 
tween behavioral and technical production functions and examina¬ 
tions of efficiency and its determinants for other services and other 
types of compensation systems would be illuminating. 





Appendix 


TABLE A1

Variable Names and Definitions

Variable
    Definition

ln(number of office visits per week)
    Natural log of the number of office visits per week
Compensation scale
    A scale varying between one and 10, increasing with the strength of the relation between compensation and productivity
Average price for an office visit
    The mean price charged for an office visit in the county in which the physician is located
Wage of a registered nurse
    The mean wage paid to registered nurses in the group
ln(number of physicians in the group)
    Natural log of the number of full-time-equivalent physicians in the group practice
Joint incentive variable
    A variable measuring the joint effect of the compensation scale, price, average cost, and group size
Group is more than 50 percent prepaid
    Dummy variable indicating if 50 percent or more of the group's revenues are prepaid
ln(physician office hours per week)
    Natural log of the number of physician hours per week
ln(number of examining rooms per physician)
    Natural log of the number of examining rooms per full-time-equivalent physician
Hours of nonphysician medical personnel per week
    Total hours of nonphysician medical personnel
Hours of administrative personnel per week
    Total hours of administrative personnel
Nonphysician personnel hours squared
    Total hours of nonphysician medical personnel plus hours of administrative personnel squared
Physician experience
    Number of years since graduation from medical school
Physician experience squared
    Number of years since graduation from medical school squared
General practice, pediatrics, obstetrics/gynecology
    Physician specialty dummies for general practice, pediatrics, and obstetrics/gynecology, respectively (internal medicine is excluded)
Responsiveness to financial incentives
    Whether the physician judges himself responsive to financial incentives
Multispecialty group
    Dummy variable for whether the group is multi- or single-specialty
Presence of graduate physician assistant
    Dummy variable for whether there is a graduate physician assistant
Lack of importance of regular income to physician
    Varies between one and four, increasing with lack of importance
Group size best providing regular income
    Varies between one and four, increasing with group size
Group type best providing regular income
    Dummy variable taking value one if type is a health maintenance organization (HMO)
Lack of importance of protection from financial loss in practice
    Varies between one and four, increasing with lack of importance
Group size best providing financial protection
    Varies between one and four, increasing with group size
Group type best providing financial protection
    Dummy variable taking value one if type is an HMO
Lack of importance of productivity related to income
    Varies between one and four, increasing with lack of importance
Group size best relating productivity to income
    Varies between one and four, increasing with group size
Group type best relating productivity to income
    Dummy variable taking value one if type is an HMO
Lack of importance of costs related to income
    Varies between one and four, increasing with lack of importance
Group size best relating costs to income
    Varies between one and four, increasing with group size
Group type best relating costs to income
    Dummy variable taking value one if type is an HMO
Lack of importance of regular hours
    Varies between one and four, increasing with lack of importance
Group size best providing regular hours
    Varies between one and four, increasing with group size
Group type best providing regular hours
    Dummy variable taking value one if type is an HMO
Preferred group size
    The size preferred by the physician
Board certified
    Dummy variable indicating if the physician is board certified

TABLE A2

Variable Means and Variances

Variable                                                           Mean        Variance

ln(number of office visits per week)                               4.46        .60
Compensation scale                                                 6.23        3.44
Average price for an office visit                                  6.99        6.31
Wage of a registered nurse                                         4.17        2.09
ln(number of physicians in the group)                              2.93        .60
Joint incentive variable                                           6.33        4.83
Group is more than 50 percent prepaid                              .04         .19
ln(physician office hours per week)                                3.22        .48
ln(number of examining rooms per physician)                        .72         .87
Hours of nonphysician medical personnel per week                   54.54       20.69
Hours of administrative personnel per week                         63.90       29.03
Nonphysician personnel hours squared                               15,552.55   10,724.56
Physician experience                                               18.70       10.01
Physician experience squared                                       449.93      370.09
General practice                                                   .31         .46
Pediatrics                                                         .18         .38
Obstetrics/gynecology                                              .13         .33
Responsiveness to financial incentives                             .32         .47
Multispecialty group                                               .65         .48
Presence of graduate physician assistant                           .26         .44
Lack of importance of regular income to physician                  2.19        .79
Group size best providing regular income                           3.16        .82
Group type best providing regular income                           .25         .43
Lack of importance of protection from financial loss in practice   3.03        .95
Group size best providing financial protection                     3.01        .99
Group type best providing financial protection                     .19         .39
Lack of importance of productivity related to income               1.89        .84
Group size best relating productivity to income                    2.33        1.21
Group type best relating productivity to income                    .02         .14
Lack of importance of costs related to income                      2.32        .84
Group size best relating costs to income                           2.31        1.10
Group type best relating costs to income                           .15         .35
Lack of importance of regular hours                                2.15        .76
Group type best providing regular hours                            .20         .40
Group size best providing regular hours                            3.15        .81
Preferred group size                                               14.65       20.20
Dummy for board certification                                      .72         .45

Liability and Large-Scale, Long-Term Hazards


Al H. Ringleb 

Clemson University 


Steven N. Wiggins 

Texas A&M University


This paper analyzes the application of liability to large-scale, long¬ 
term hazards. The key features distinguishing such hazards are the 
long temporal separation between exposure to a hazard and disease 
and the large damages when injuries finally emerge. The large scale 
of damages creates a strong incentive to avoid liability payments, and 
the long temporal separation creates numerous avenues through 
which parties can avoid paying possible damage awards. The analysis 
focuses on the incentive to avoid paying damages by vertically divest¬ 
ing production tasks associated with serious occupational risks. Such 
divestiture can lower liability costs if the small firm operating the 
risky stage goes out of business before latent injuries emerge or has 
insufficient assets to pay damages and declares bankruptcy when 
suits are filed. The paper then presents an empirical regression analysis
of small-firm entry into the U.S. economy between 1967 and
1980, the period in which liability laws were changing. The point 
estimate is that, ceteris paribus, liability changes appear to have led 
to a large increase in small corporations in hazardous sectors. Hence 
the empirical analysis shows widespread attempts to avoid liability by 
shielding assets through divestiture.

I. Introduction

Exposure to large-scale, long-term hazards has recently emerged as a 
pressing policy concern. Consumers and workers have been exposed 
to health risks associated with radiation, DES, cigarette smoking, sac¬ 
charin, occupational carcinogens, asbestos, dioxin, vinyl chloride, 
PCB, beta-naphthylamine, benzidine, coke-oven emissions, and numerous
other latent hazards. Recent studies have also linked occupational
cancer to such basic materials as petroleum, coal, paraffin,
wood, leather, iron ore, nickel, and chromium, all of which are funda¬ 
mental inputs to production (Cole and Merletti 1980). These linkages 
suggest the likely exposure of hundreds of thousands of workers to 
possible and known carcinogens. Given the pervasive nature of the 
problem, an important economic question is how to institutionally 
govern compensation for workers injured by exposure to such haz¬ 
ards. 

Tort liability, for good or ill, is quickly emerging as a primary 
institutional form. 1 A key advantage of a liability system is that such a
system is injury-based. 2 Such a system permits compensation for injuries
to be made after parties determine the extent of damages associated
with hazardous products and other activities. Accordingly,
liability can produce both appropriate incentives for firms to pro¬ 
vide safety and implicit insurance to injured workers or consumers 
(Spence 1977). 

With latent hazards, liability appears particularly attractive because 
it permits injury-based compensation in an area in which dangers are 
(arguably) difficult to foresee. Despite its increasingly widespread
adoption, however, there has been little evaluation of liability in a 
latent hazard setting. As a result, our understanding of how liability 
will perform rests largely on a simple extrapolation of analyses in 
acute injury settings (see, e.g., Oi 1973; Spence 1977). 

This paper analyzes the use of a liability system in the latent hazard 
setting. The paper argues that the attractiveness of liability is some¬ 
what superficial because the injury-based nature of the system leads to 
unique enforcement problems. In particular, the paper argues that a 
major option to minimize liability exposure is for firms to segregate 
risky activities in small corporations. Such segregation becomes valu¬ 
able when latent injuries later manifest themselves and the claimants 

1 A number of other arrangements could also be used, including simple wage premiums,
insurance, federal compensational programs, and direct regulation. See Wiggins
and Ringleb (1989) for a more comprehensive discussion of the alternative institutional
forms.

2 According to Wright (1944, p. 238), “The purpose of the law of torts is to adjust
losses, and to afford compensation for injuries sustained by one person as the result of
the conduct of another.”
are restricted to the assets of the small corporation to pay the associated
liability damages.

The primary focus of the paper is to present empirical evidence 
that suggests that such firm efforts to avoid liability are widespread. 
This empirical evidence is generated by econometrically analyzing the 
pattern of entry by small corporations into the U.S. economy over the 
period 1967-80. The analysis shows that changes in potential liability 
appear to be closely linked to substantial increases in the number of
small firms operating in hazardous sectors. The entry of these small 
firms, moreover, seems designed primarily to avoid liability. The im¬ 
plication is that liability, at least as currently imposed, may be difficult 
to enforce because these small firms have insufficient assets to pay 
substantial damage awards. Hence liability may not lead to large dam¬ 
age awards in long-run equilibrium, but instead may simply lead to a 
restructuring of enterprises to avoid damage payments. To the extent 
that this finding is verified, liability may perform significantly less well 
than existing analyses would suggest. The implication is that these 
enforcement problems need to be carefully addressed in continuing 
efforts to evaluate liability and to deal with the latent hazard problem. 

The remainder of the paper is organized as follows. Section II 
discusses how latent hazards differ from other hazards and then 
shows how these differences create an incentive for firms to spin off 
hazardous activities to minimize the exposure of assets to liability 
claims. Section III then shows how the change in liability rules of the 
early 1970s created incentives for such divestiture during that period. 
The section then goes on to examine how changing liability rules are 
linked to substantial changes in the rate of entry of small corporations 
in risky sectors of the economy, suggesting that divestiture may be a
significant problem. Section IV then summarizes the results and 
briefly examines several recent experiments designed to improve en¬ 
forcement. 


II. Latent Hazards, Small Firms, and the 
Enforcement of Liability

Latent hazards differ from other hazards both because of the severe 
nature of potential injuries and because of the long separation be¬ 
tween exposure to substances and the manifestation of injury. Both of 
these distinguishing features contribute to enforcement problems for 
a liability system. 

The hazards described in the introduction hold in common a significant
potential to cause cancer, birth defects, and other major, life-threatening
risks. The severity of these risks means that individual
injuries are likely to be severe, and as a consequence each worker or 
consumer represents a potentially large damage award. Even a few
suits can lead to formidable damage awards, and numerous suits can 
easily represent sums that are substantial relative to the assets of even 
large firms. The large size of these potential damage awards creates 
strong incentives to be concerned with liability and to make every 
effort to minimize potential damages. 

Liability, moreover, is designed precisely to create such incentives; 
through efforts to reduce damage payments, firms provide increased 
safety. The problem with latent injuries, however, is that potential 
liabilities are sufficiently large that they can easily come to dwarf the 
assets of even large, profitable corporations (Epstein 1982, p. 8). 
When damages reach such a large scale, firms can easily become more 
concerned with structuring their affairs to minimize exposure rather 
than with improving safety. 3 These incentives to restructure can then
undermine the otherwise attractive incentive properties of a liability 
system. 

While large damage awards pose problems, what really sets latent 
hazards apart is the long temporal separation between exposure to a 
dangerous substance and the manifestation of injury. Latent injuries 
remain dormant for long periods, ranging anywhere from 4 to 40 
years and averaging roughly 20 years (Armenian and Lilienfeld
1974). This latency period creates distinctive enforcement problems 
for traditional liability systems because such systems are injury-based. 
In other words, damages are generally assessed only for actual and 
not for speculative or hypothetical damages. Hence damages are assessed
only after an injury manifests itself. 4

The key advantage of such traditional tort systems is that they allow 
parties to observe actual injuries before damage payments are made. 
With ordinary injuries, moreover, such a system functions smoothly 
because damages are paid in the ordinary course of business as injuries
occur during the production, sale, and use of products. 5

With latent injuries, however, an injury-based system creates a long 
separation between the activities that lead to injury and damage 

3 More formally, when large damage awards lead to a large probability of bank¬ 
ruptcy, liability may lead to either increases or decreases in safety, but it creates an 
unambiguous incentive to avoid payments. See Wiggins and Ringleb (1989) for a for¬ 
mal model and an extensive theoretical analysis of the incentive properties of liability in 
a latent hazard setting. 

4 For case law supporting this position, see, e.g., Clutter v. Johns-Manville Corp., 646
F. 2d 1151 (6th Cir. 1981); Urie v. Thompson, 337 U.S. 163 (1949); Barnes v. A. H.
Robbins Co., 476 N.E. 2d 84 (Ind. 1984); Raymond v. Eli Lilly & Co., 117 N.H. 164,
371 A. 2d 170 (1977).

5 With acute injuries it is also possible to use an occurrence form of liability insurance
so that insurance premiums can be continuously updated as injuries occur. With latent 
hazards, injuries are typically delayed for 20 years, seriously undermining the ability to 
use an occurrence form of liability insurance. 



awards. This separation creates severe enforcement problems be¬ 
cause it means that damage obligations accrue for many years. As 
time passes, the accumulated damages can become formidable. This 
accumulation is then exacerbated by the severe nature of the associ¬ 
ated injuries. The net result is a large incentive to escape payment. 

Besides just producing an incentive to avoid payments, however, 
this long separation also provides the potential opportunity. The long 
delay between exposure and injury means that agents can engage in 
risky activities and reap the associated returns, while remaining con¬ 
fident that potential damage awards lie many years in the future. 
Hence firms can often produce for substantial periods with relative 
impunity. Then before liability obligations appear, such agents can 
simply go out of business and enter entirely separate lines of work. 
Successful pursuit of such a strategy will leave few assets exposed to 
damage awards when injuries appear. 

A primary way firms implement such a strategy is by shutting down 
or divesting their interest in hazardous portions of their production 
processes and then purchasing the required input from a specialized 
producer (see, e.g., Stone 1980, p. 71; “Why Small Companies Will
Survive” [1982, p. 63]; Kraakman 1984, p. 872). Under such an ar¬ 
rangement, the purchasing firm is not liable for injuries associated 
with the manufacture and use of the product made by the specialized 
producer. 6

Hence in normal circumstances the responsibility for the safety of 
workers directly associated with the hazardous process is generally 
assumed to be a natural part of a firm’s legal responsibility to direct 
and monitor the production process. The simple rationale behind 
such a rule is that a firm purchasing inputs cannot generally dictate or 
monitor the working conditions and safety precautions of the firm 
from which it purchases inputs; such a requirement would effectively 


6 There are several exceptions to this general rule. For example, liability may not be
avoided if the specialized producer is an employee of the purchasing firm (Keeton
1984, pp. 501-8), if the specialized producer is an independent contractor for whom
the purchasing firm has failed to provide sufficient information regarding the hazard
process to take the necessary precautions (pp. 509-16), or finally if the product produced
by the specialized producer is itself hazardous and the purchasing firm incorporates
the product into its own product and then fails to warn its consumers of the risks
involved in the use of the product. See, e.g., Michalko v. Cooke Color and Chemical
Corp., 91 N.J. 386, A. 2d 179, 181-83 (1982) (the defendant independent contractor
did not design or sell the product, but only manufactured it; the defendant manufacturer
was held strictly liable for failing to warn buyers of defects). Several other arrangements
are yet to be tested in the courts. For example, firms have established
special subsidiaries, spin-offs, product manufacturing by contract, purchase of products
from specialized producers, and other similar arrangements in an apparent effort
to avoid potential liability (see, e.g., Scredon and Glaberson 1985, pp. 58-59; Roe
1986).
break down arm's-length market exchange. The legal responsibility
for safety then generally falls on the firm whose workers are exposed 
to the hazardous process. In addition, if the product produced is itself 
hazardous, then the specialized producer will be liable for injuries to 
workers in those firms using the product as an input, if it has failed to 
warn of the risks associated with the product’s use (see, e.g., Borel v. 
Fiberboard Products Corp., 493 F. 2d 1076 [5th Cir. 1973], cert. denied,
419 U.S. 869 [1974]). 

The primary disadvantage of using divestiture to avoid liability is 
that under such a strategy firms sacrifice potential economies of inter¬ 
nal organization. When these economies are small relative to the po¬ 
tential liability damage awards, however, the small independent cor¬ 
poration becomes the cost-minimizing way to operate in a latent 
liability environment. Hence large firms can often minimize the cost 
of obtaining inputs whose manufacture is risky by purchasing such 
inputs from small corporations with few assets. 
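
The cost comparison described above can be sketched numerically. This is a minimal illustration, not the formal model in Wiggins and Ringleb (1989): it assumes a single claim probability, damages that claimants can collect only up to the assets of the firm that ran the risky task, and full pass-through of the small firm's expected liability into the input price. All function names, parameter names, and numbers are hypothetical.

```python
def expected_liability(claim_prob, damages, reachable_assets):
    """Expected damage payment when claimants can collect at most the
    assets of the firm that operated the hazardous task."""
    return claim_prob * min(damages, reachable_assets)


def cost_integrated(production_cost, claim_prob, damages, large_firm_assets):
    """Expected cost when the large firm performs the risky task itself."""
    return production_cost + expected_liability(
        claim_prob, damages, large_firm_assets)


def cost_divested(production_cost, lost_economies, claim_prob, damages,
                  small_firm_assets):
    """Expected cost when the input is bought from a small corporation.

    lost_economies stands for the forgone economies of internal
    organization; the small firm's expected liability is assumed to be
    passed through in the input price.
    """
    return production_cost + lost_economies + expected_liability(
        claim_prob, damages, small_firm_assets)


def divestiture_is_cheaper(production_cost, lost_economies, claim_prob,
                           damages, large_firm_assets, small_firm_assets):
    """True when buying from a small, thinly capitalized firm costs less."""
    return cost_divested(
        production_cost, lost_economies, claim_prob, damages,
        small_firm_assets
    ) < cost_integrated(
        production_cost, claim_prob, damages, large_firm_assets)


# Hypothetical numbers: large latent damages relative to lost economies.
print(divestiture_is_cheaper(100.0, 5.0, 0.1, 1000.0, 5000.0, 50.0))

# Hypothetical numbers: small damages, so integration stays cheaper.
print(divestiture_is_cheaper(100.0, 5.0, 0.1, 20.0, 5000.0, 50.0))
```

In this sketch the advantage of divestiture comes entirely from the cap on collectible damages, which is the enforcement problem the text emphasizes: the gap between the two costs grows with the size of potential damages and shrinks with the assets of the small firm.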

When such implicit divestiture is the cost-minimizing response to 
liability, such a system simply leads to an equilibrium in which small 
corporations handle hazards. These firms then turn over at a high 
rate to avoid damage payments. Under such an equilibrium, liability 
leads neither to substantial incentives for safety nor to appropriate 
compensation for injury. In fact one can show that individual worker 
compensation can easily fall, safety can rise or fall, and more workers 
may well be exposed to hazards as small, labor-intensive firms take 
over production. 7 

Such implicit divestiture is a potentially significant problem because 
in many sectors the most serious latent risks are concentrated in a 
relatively small set of specific tasks. In chemical manufacture, for 
example, the most serious risks are often associated with specialty 
chemicals and processes (see, e.g., “Why Small Companies” [1982, p. 
63]) or occur when potent raw chemicals are combined to create a 
finished or secondary product (Moore v. Allied Chemical Corp., 480 F.
Supp. 364 [1979]; Wall Street Journal, October 6, 1976; Goldfarb 
1977). Handling of the chemicals and other products produced by 
these hazardous tasks, both before and after the hazardous activity, is 
then relatively safe. Similarly, the general operation of a nuclear reac¬ 
tor is quite safe and involves minimal health hazards to the relevant 
set of workers. In contrast, the cleaning of the reactor vessel involves 
intense exposure to workers over short periods of time; hazards in 
this task are severe (Wall Street Journal, October 12, 1983). As another 

7 See Wiggins and Ringleb (1989) for formal proofs of these claims and a more 
complete discussion of these theoretical issues. The labor intensity of small firms 
emerges because capital is less valuable to such firms because of the likelihood that 
capital will lose its value to shareholders if it is claimed as part of a damage settlement. 
example, instrument manufacture involves intensive exposure to
heavy metals and other carcinogens, but once the instruments are 
completed, ordinary handling is quite safe (Hickey and Kearney 
1977). The implication is that hazardous tasks are often distinct from 
other, safer activities. 

Even when the tasks are difficult to separate, moreover, the large 
scale of potential liability creates significant incentives for firms to 
take extraordinary measures in an attempt to segregate risky activities 
from the assets associated with the safer activities of a large firm. 

The implication of these arguments is that large, integrated firms 
have an incentive to divest risky activities and minimize the exposure 
of assets to potential liability claims. The disadvantage of this arrange¬ 
ment is that firms must forgo economies of internal organization, and 
hazardous tasks must be segregated in small corporations. 

The key issue then becomes an empirical one: how steep is the 
trade-off between the cost of vertical divestiture and the benefit of 
avoiding liability payments? When avoiding liability yields large cost 
savings, divestiture will be common. When it is technologically 
difficult to separate risky tasks, divestiture will be rare. If 
divestiture is common, firms must believe that it significantly reduces 
their expected liability payments, and liability will then fail to 
generate appropriate incentives for safety or compensation for the 
injured. Instead, liability merely leads to corporate restructuring, 
small-firm operation of hazardous tasks, and high firm turnover rates. 
The result is a system with questionable incentive properties. The key 
to evaluating the liability system, then, is to examine the extent of 
divestiture. Attention is now turned to this empirical trade-off. 


III. Empirical Tests 

A. An Econometric Model 

The key empirical issue is how firms are responding to recent 
changes in the liability system, and in particular whether hazards are 
sufficiently concentrated in particular tasks to enable firms to 
segregate hazards in small corporations on a substantial scale. The 
natural way to examine divestiture is to examine how recent changes 
in the liability system have affected the structure of organizations 
in risky sectors of the economy. The obvious period to examine for 
these effects is 1967-80, the period surrounding rapid changes in 
liability law. 

Prior to the mid-1960s, the law offered substantial obstacles to 
workers seeking damages for latent injuries. State workers’ compensation 
programs routinely rejected claims for latent injuries, particularly 
when the latency period was long and the injury seemed similar 
to “the ordinary diseases of life.” Further, the workers’ compensation 
statutes in nearly all states precluded workers from seeking redress 
through the court system under tort law, affording employers strong 
protection from liability suits in the large majority of cases. 8 In those 
circumstances in which employees were able to sue, the evidentiary 
requirements under then existing tort law posed formidable addi¬ 
tional barriers to recovery. 

Beginning in the late 1960s, numerous changes in the legal envi¬ 
ronment caused these obstacles to worker recovery to begin to un¬ 
ravel. The most fundamental changes involved inroads made in 
third-party liability and against the workers’ compensation exclusivity 
rule; these changes allowed workers to more easily seek compensation 
for injuries under tort law. 9 The advent of strict liability in tort in 
latent injury cases also allowed workers suffering from latent injuries 


8 All workers’ compensation statutes provide that "the Compensation remedy is 
exclusive of all other remedies by the employee or his dependents against the employer 
and insurance carrier for the same injury, if the injury falls within the coverage formula 
of the Act" (Larson 1976, sec. 65.00). 

9 Recent cases have recognized several exceptions to the exclusivity rule. The most 
obvious exception is that the worker must be in that class of worker subject to workers' 
compensation; employees outside this class may pursue common law remedies (see 
Prosser 1971, p. 526). In almost all jurisdictions, an exception to the exclusivity rule is 
recognized for intentional torts (see Larson [1976, sec. 68.13, p. 13-5] and cases cited in 
his n. 11). A growing number of jurisdictions are beginning to define intentional torts 
more broadly to include those situations in which the employer was knowledgeable 
about the hazardous nature of the employment (see, e.g., Boudeloche v. Grow Chemical 
Coating Corp., 728 F. 2d 759 [5th Cir. 1984]). A number of courts have held the 
firm liable for workers’ injuries resulting from the employer’s “willful, wanton, and 
reckless misconduct”: Mandolidis v. Elkins Industries, Inc., 246 S.E. 2d 907 (1978); 
Blankenship v. Cincinnati Milacron Chem. Inc., 433 N.E. 2d 572 (1982); Wade v. 
Johnson Controls, Inc., 693 F. 2d 19 (2d Cir. 1982) (Vermont law applied). Workers 
have been allowed to recover in those situations in which the firm fraudulently conceals 
from workers the fact that they are suffering from an occupational disease (see 
Johns-Manville Products Corp. v. Superior Court, 612 P. 2d 948 [1980] [lung cancer]; Del 
Monte v. Unitcast Division of Midland Ross Corp., 411 N.E. 2d 814 [1978] [silicosis]). In 
a minority of jurisdictions, workers may sue the employer if the employer has breached 
a duty independent of those imposed on it by virtue of being an employer, so-called 
dual-capacity situations. Dual-capacity liability generally arises in torts involving 
product liability or medical malpractice. In the product liability cases, an injured worker 
maintains an action against the firm based on injuries or illnesses caused by the product 
the firm manufactures if the product is manufactured for sale to the public. See, 
generally, “Dual Capacity Doctrine” (1979). Plaintiffs are allowed to bring suit in several 
jurisdictions if it can be demonstrated that corporate strategy existed for concealing 
workplace-related diseases discovered in company examinations (see, e.g., 
Johns-Manville Products Corp. v. Contra Costa Superior Court, 27 Cal. 3d 465, 612 P. 2d 948, 
165 Cal. Rptr. 858 [1980]; Millison v. E. I. duPont de Nemours & Co., 501 A. 2d 505, 
101 N. J. 161 [1987]). Medical malpractice may apply depending on whether or not the 
physician is an employee of the firm (see Proctor v. Ford Motor Co., 302 N.W. 2d 580 
[1973]) and to whom the benefit of those medical services accrues (physician-patient 
relationship required; see Rogers v. Horvath, 237 N.W. 2d 595 [1975]). See, generally, 
Blum (1978, p. 433). 

to press tort actions under substantially lower evidentiary requirements, 
which led to increased success (see, e.g., Borel v. Fiberboard 
Products Corp., cited in Sec. II). Damage awards in successful tort cases 
also came to dwarf those found under workers’ compensation, sub¬ 
stantially raising firm costs. At the same time, judicial experiments 
emerged regarding statutes of limitations, causation, the application 
of joint and several liability, and others, all of which further reduced 
barriers to worker recovery. 10 

Concurrently, substantial medical and scientific evidence linking 
occupational exposures to latent disease began to emerge. These 
scientific findings reinforced the impact of judicial changes. Medical 
results ever more closely linked hazardous substances to latent occu¬ 
pational injuries. When combined with changes in firms’ legal respon¬ 
sibilities at tort, these medical changes substantially increased firms’ 
potential liability exposure for latent injuries. Prior to these develop¬ 
ments, firms had little reason to be concerned with possible damage 
awards for latent occupational injuries; afterward expected liabilities 
became a major concern. Hence these changes in legal structure and 
in medical opinion led to large increases in potential liability costs for 
firms operating in hazardous sectors of the economy. 

If divestiture is an important response to liability, then these 
changes in liability should lead to substantial increases in the number 
of small corporations operating in hazardous sectors of the econ¬ 
omy. 11 This entry will occur as large firms cease production associated 
with hazards and then seek outside suppliers that have lower ex¬ 
pected liability costs, an implicit form of divestiture. 12 Since hazard- 

10 See, e.g., Sindell v. Abbott Labs., 26 Cal. 3d 588, 607 P. 2d 924, 163 Cal. Rptr. 132, 
cert. denied, 101 S. Ct. 268 (1980) (liability was assessed on the basis of each firm's market 
share within the industry in a case in which the plaintiff could not identify the 
responsible firm). 

11 Divestiture is likely to be in the form of a corporation. According to the general 
rule of corporate limited liability, shareholders are not liable for the torts of the 
corporation (see, e.g., Zubik v. Zubik, 384 F. 2d 267 [3d Cir. 1967], cert. denied, 390 U.S. 988 
[1968]). Limited liability will be sustained by the courts as long as the corporate entity is 
not used by its owners to perpetuate a fraud, evade an existing obligation, or circum¬ 
vent the law (Henn 1970, p. 146). If the corporate form of organization is used for such 
purposes, the court will ignore the corporate structure by "piercing the corporate veil" 
and holding the shareholders liable personally. In context, this means that the corpo¬ 
rate form of organization cannot provide an ex post shield from tort liabilities arising 
from worker injuries. Rather, the corporation will need to have been selected and put 
into place by the owners well in advance of any tort claims arising from the use or 
production of their products to avoid personal liability. 

12 This implicit form of divestiture will often be chosen so that the firm can avoid any 
possible legal entanglements with continuing hazardous production. At present a firm 
is not generally responsible for hazardous exposures that occur after a subsidiary is 
sold, but as the courts continue to modify rules, prudent firms may choose to shut down 
rather than spin off hazardous units. The reader should note that in either event, 
spinning off or shutting down a unit does not generally shield the firm from liability for 
past activities. 

ous activities are often a specialized portion of production, moreover, 
the best way to measure these effects is to examine changes in the 
number of small firms. Pure numbers of firms are likely to be more 
sensitive than value-added measures because of the specialized nature 
of the tasks in which exposure to hazards is severe. 

On the other hand, if divestiture does not provide an effective 
shield from liability, increases in liability costs should not affect the 
number of small corporations. Finally, regardless of the effect on 
small firms, changes in liability should not generally affect the num¬ 
ber of medium and large firms because such firms will divest only 
small, hazardous portions of production. 

Accordingly, data were gathered for the number of small corpora¬ 
tions operating in various sectors of the economy for the years 1967 
and 1980. The year 1967 was chosen to precede the rapid changes in 
the liability system, which occurred primarily in the first half of the 
1970s, while 1980 was chosen to enable firms sufficient time to re¬ 
spond to the changes. Small corporations were defined as ones with 
assets of less than $250,000 in 1980 dollars (= $100,000 in 1967 
dollars). Since such small corporations are relatively homogeneous, 
moreover, increased numbers ought also to be closely linked to in¬ 
creased value-added but the number of firms is likely to be more 
sensitive. The data are derived for manufacturing industries using 
Internal Revenue Service (IRS) data. 13 
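
The size cutoff above can be read as a constant-dollar threshold: $250,000 in 1980 dollars is treated as equal to $100,000 in 1967 dollars, which implies a price ratio of 2.5 between the two years. The sketch below illustrates that classification. The function and constant names are illustrative assumptions and do not come from the IRS sources.

```python
# Asset thresholds for a "small corporation," as stated in the text:
# $100,000 in 1967 dollars, treated as equal to $250,000 in 1980 dollars.
SMALL_FIRM_THRESHOLD = {1967: 100_000, 1980: 250_000}


def implied_price_ratio() -> float:
    """Ratio of 1980 to 1967 price levels implied by the two thresholds."""
    return SMALL_FIRM_THRESHOLD[1980] / SMALL_FIRM_THRESHOLD[1967]


def is_small_corporation(assets: float, year: int) -> bool:
    """Return True if nominal assets fall below that year's cutoff.

    Only 1967 and 1980 are supported, because those are the only
    years for which the text states a threshold.
    """
    if year not in SMALL_FIRM_THRESHOLD:
        raise ValueError(f"no threshold stated for {year}")
    return assets < SMALL_FIRM_THRESHOLD[year]


if __name__ == "__main__":
    print(implied_price_ratio())                # 2.5
    print(is_small_corporation(180_000, 1967))  # False
    print(is_small_corporation(180_000, 1980))  # True
```

A firm with $180,000 in nominal assets would thus count as small in 1980 but not in 1967, which is why a constant-dollar cutoff is needed to compare counts across the two years.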

The primary empirical task is to determine the effect of changing 
liability costs on the rate of entry of these small corporations. Chang¬ 
ing liability costs are created by worker exposures to carcinogens, 
which translated into increased expected liability costs when workers 
were given more ready access to tort actions. Accordingly, one way to 
measure expected liability costs is to measure the exposure of workers 
to carcinogens in the workplace. 

By measuring worker exposure to carcinogens, the analysis mea¬ 
sures the incentive of firms to avoid prospective future liabilities that 
firms would expect if they continued to be associated with hazards. 
Hence worker exposure in, say, the early 1970s measures prospective 

13 The sources of the data are Statistics of Income (1967), Corporate Income Tax Returns 
(1968), and Corporation Source Book of Statistics of Income (1980). The 1980 end point for 
the period was chosen because it provided ample time for firms to respond to liability 
changes and because of the way the IRS data are reported. The IRS reports brackets of 
$100,000 and $250,000 ($250,000 in 1980 dollars is equal to $100,000 in 1967 dollars). 
Hence by using 1980 as the end point, we had available a constant-dollars measure of 
the number of firms at the end of the period. The beginning date was chosen to 
predate changes in liability and to correspond with the publication of a Census of 
Manufactures so that other variables would be available. The data sources did not pro¬ 
vide data for tobacco manufacturers because of confidentiality constraints, and so this 
industry had to be excluded from the data set. 

future liabilities that would become apparent in the late 1980s or early 
1990s. In contrast, reorganization is unlikely to avoid past liabilities. 
The reason is that reorganization would not generally relieve firms of 
their legal responsibility for injuries to their workers that were al¬ 
ready incurred before the changes in liability rules. 14 Hence the pri¬ 
mary motive for divestiture would be for firms to avoid liability associ¬ 
ated with future activities. 

There are numerous ways worker exposure to carcinogens can be 
measured, but perhaps the best data were developed in a study by 
Hickey and Kearney (1977). In determining the number of workers 
exposed to a given substance, they relied on the 1972-74 National 
Occupational Hazard Survey (National Institute for Occupational 
Safety and Health 1974). These data, collected within the manufac¬ 
turing sector for 86 carcinogenic and suspected carcinogenic chemi¬ 
cals, are presented in table 1. 

These data differ from those in other studies because Hickey and 
Kearney measure the frequency of exposures of individual workers 
to a large number of carcinogenic and suspected carcinogenic agents. 
In contrast, alternative data sources often simply use the volume of 
hazardous substances an industry produces, under the unrealistic as¬ 
sumption that human exposures are directly proportional to produc¬ 
tion. Still other sources use census data that classify all workers in a 
plant as having been exposed, regardless of whether they are directly 
involved with the process in which the carcinogen is used. Both alter¬ 
natives are qualitatively inferior to the Hickey-Kearney methodology, 
which attempts to measure exposures at the worker level. Unfortu¬ 
nately the Hickey and Kearney data are available only at the two-digit 
Standard Industrial Classification level. Accordingly, the econometric 
analysis must be carried out using two-digit data. 

To estimate the relationship between entry of small corporations 
and worker exposure to latent hazards, it is important to develop a 
fully specified model of entry. Let entry by small firms be 

$$E_i = g(H_i, X_i, e_i), \tag{1}$$

where $E_i$ is the rate of entry of small corporations in industry $i$ between 
1967 and 1980, $H_i$ is worker exposure to carcinogens in the 

14 While the rule cited in the text is typical, it is important to note that legal prece¬ 
dents in this area remain unsettled. Accordingly, some firms also seem to be attempting 
to restructure to avoid past liabilities on the off chance that such a strategy will be 
successful. For example, Roe (1986) argues that if the base firm has already marketed 
the product, a subsequent transfer or spin-off should not relieve the base firm of 
liability for injury. The liability issue is clouded, however, if, at the time of transfer or 
spin-off, the dangerous product had not yet produced substantial injury or if the 
product development was at that time incomplete. The analysis here will reflect restruc¬ 
turing that is motivated by either of these efforts to avoid liability. 

TABLE 1 

Frequency of Worker Exposures to Carcinogens and Suspected Carcinogens 
for Two-Digit Industries in the Manufacturing Sector 

Standard Industrial                                    Worker Exposures 
Classification       Industry                          to Carcinogens 

20                   Food and kindred products                  2,874 
21                   Tobacco manufacturers                     13,885 
22                   Textile mills                              2,607 
23                   Apparel                                    7,591 
24                   Lumber and wood products                   5,588 
25                   Furniture and fixtures                     4,309 
26                   Paper and allied products                 14,205 
27                   Printing and publishing                   19,625 
28                   Chemical and allied products              25,984 
29                   Petroleum and coal products               38,398 
30                   Rubber and plastics                       16,987 
31                   Leather and leather products 
32                   Stone, clay, and glass                    11,985 
33                   Primary metals                            24,367 
34                   Fabricated metals                         90,428 
35                   Machinery (except electrical)             51,927 
36                   Electrical machinery                      81,533 
37                   Transportation equipment                  53,574 
38                   Instruments                              107,023 

Source: Hickey and Kearney (1977, p. 29, table 7). 


workplace in industry $i$, $X_i$ is a set of other factors affecting entry rates 
of small firms, and $e_i$ is a random error term that is assumed to have 
zero mean and finite variance and to be independent of the 
right-hand-side variables. 
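
The dependent variable can be illustrated with a short calculation. The sketch below assumes, following the later description of the variable as a percentage change in the number of small corporations, that the entry rate for an industry is the percentage change in its count of small corporations between 1967 and 1980. The function name and the counts are hypothetical.

```python
def entry_rate(count_1967: int, count_1980: int) -> float:
    """Percentage change in the number of small corporations, 1967 to 1980."""
    if count_1967 <= 0:
        raise ValueError("the 1967 count must be positive")
    return 100.0 * (count_1980 - count_1967) / count_1967


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the calculation.
    print(entry_rate(1_000, 1_392))  # 39.2
    print(entry_rate(1_000, 950))    # -5.0
```

Defined this way, the rate is negative for an industry that loses small corporations over the period.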

If the number of small firms was in long-run equilibrium in 1967 
and if $g(\cdot)$ and $X_i$ are properly specified, then the coefficient of $H_i$ will 
measure the effect of changing liability costs on the rate of entry by 
small corporations. To see why, note that if one begins in a long-run 
equilibrium, systematic changes in the number of small corporations 
can be traced either to changes in ordinary market conditions or to 
firm efforts to evade liability. If proper account is then taken of these 
changing market conditions, the coefficient of $H_i$ will measure the 
partial effect of changes in liability costs on the number of small firms. 

More generally, even if the number of small firms was not in 
long-run equilibrium in 1967, the interpretation of the coefficient of $H_i$ will 
still be correct, as long as the source of the disequilibrium was not 
statistically correlated with the independent variables. The goal, then, 
is to account for other market factors that might influence entry to 
ensure an unbiased estimate of the effects of exposure to hazards and 
liability. 

Industry growth is the first important structural factor to account 
for percentage changes in the number of small corporations. If the 
industry began in equilibrium, then an increase in an industry’s 
value-added will generally increase the number of firms of all sizes, 
including small firms. Put differently, with constant market shares for large 
and small firms, growth will increase the number of small firms. To 
account for this influence on the number of small firms, industry 
growth rates in value-added were calculated from Census of Manufactures 
data for the period 1967-77. 

In addition, however, there may be systematic increases or de¬ 
creases in the market share of small firms over the sample period. 
Such changes could be traced to changes in government policy or 
other economywide factors. Such considerations would generate a 
systematic change in the number of small firms across all markets, 
which can be accounted for by using a constant term in the regres¬ 
sions. 15 Such a term will measure the average percentage change in 
small firms in the economy as a whole. Hence, the growth variable will 
account for changes in a particular industry’s size, which might lead to 
a change in the number of small firms, while the constant term will 
account for systematic economywide influences on the number of 
small firms. 

Other factors might also change the rate of entry of small firms in 
an industry, and previous studies of entry were used to assist in deter¬ 
mining the set of factors to be examined. These studies identify sev¬ 
eral determinants of changes in the number of small firms over 
time. 16 One such factor is energy costs. Energy costs rose sharply 
during the 1967-80 period, and large and small firms differ 
systematically in their energy intensities. Hence, energy cost increases during 
the period should bring about increased small-firm entry. Pashigian 
(1984) developed a measure of the relative energy costs of large 
and small firms from unpublished Bureau of Census tables. That 
measure is used here. 

In addition, more traditional structural conditions may lead to 
changes in the number of small firms in the period in question. These 


15 In addition, major policy changes were investigated within individual sectors. Dur¬ 
ing the period there were important policy changes in petroleum and coal products 
that lowered the costs of small firms relative to large firms. Most important of these 
policies was the small refiner bias in the fuel entitlement program: refiners using less 
than 175,000 barrels of oil per day were heavily subsidized. In addition, there were 
major policy changes in the tobacco industry that favored small firms. Given the purely 
cross-sectional nature of the estimating equation, there was no statistical mechanism to 
account for these policies other than a dummy variable, which would have eliminated 
any influence of this observation on the remaining coefficients. Hence, these observa¬ 
tions were omitted. 

16 Some of the more prominent previous studies of entry include Mansfield (1962), 
Peltzman (1965), and Orr (1974). 

TABLE 2 

Descriptions and Means of Dependent and Independent Variables 

Variable                                                             Mean 

Dependent variable: 
  Entry rates of small corporations (%) (entry)                      39.2 
Independent variables: 
  Hickey and Kearney index (expected disease liability)            33,010 
  Industry growth rate (%) (growth)                                 33.09 
  Reciprocal of distances goods are shipped (regional market)       384.4 
  Relative energy costs of large and small firms (%) (energy)       88.02 
  Weighted-average four-digit concentration ratios (%) (CR4)        34.06 
  Advertising to sales ratio (%) (AD/sale)                            1.0 

Note: The names of variables used in the regression tables are in parentheses. 


factors include the level of concentration and the advertising to sales 
ratios. Concentration is measured at the two-digit level by using a 
weighted average of the four-digit concentration levels (see Collins 
and Preston 1968). 17 The advertising to sales ratio is used to measure 
product differentiation as a barrier to small-firm entry (see Comanor 
and Wilson 1967). The latter measure is introduced both directly and 
with a slope dummy that allows for a different relationship between 
advertising and entry in producer and consumer goods industries. 18 
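
The two constructed regressors described in this paragraph can be sketched as follows. The code assumes that the two-digit concentration measure is a shipment-weighted average of four-digit four-firm concentration ratios and that the slope dummy is the advertising to sales ratio multiplied by a zero-one indicator for consumer goods industries. All function names and input values are hypothetical illustrations.

```python
from typing import Sequence


def weighted_concentration(cr4: Sequence[float], shipments: Sequence[float]) -> float:
    """Shipment-weighted average of four-digit concentration ratios."""
    if len(cr4) != len(shipments) or not cr4:
        raise ValueError("need equal-length, non-empty inputs")
    total = sum(shipments)
    if total <= 0:
        raise ValueError("total shipments must be positive")
    return sum(c * s for c, s in zip(cr4, shipments)) / total


def advertising_terms(ad_sales: float, is_consumer_good: bool) -> tuple[float, float]:
    """Return (direct term, slope-dummy term) for the advertising variable."""
    dummy = 1.0 if is_consumer_good else 0.0
    return ad_sales, ad_sales * dummy


if __name__ == "__main__":
    # Hypothetical four-digit industries within one two-digit sector.
    print(weighted_concentration([20.0, 40.0], [300.0, 100.0]))  # 25.0
    print(advertising_terms(1.5, True))    # (1.5, 1.5)
    print(advertising_terms(1.5, False))   # (1.5, 0.0)
```

The slope-dummy term is zero for producer goods industries, so its coefficient captures only the additional effect of advertising in consumer goods industries.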

Finally, a regional market variable was also included in the model. 
Since changes in market shares of small firms may differ system¬ 
atically in geographically segmented markets, a measure was included 
to control for such factors. The variable, developed by Weiss (1972), is 
the reciprocal of the distance goods are shipped, as reported in the 
Commodity Transport Surveys of the 1977 Census of Transportation. 
Simple statistics for the variables are reported in table 2. 
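
Footnote 19 describes the reported regressions as simple additive specifications. Under that reading, the fullest variant could be written roughly as below. The coefficient symbols $\beta_0, \dots, \beta_7$, the variable abbreviations, and the dummy $D_i$ are notation assumed here for illustration and are not taken from the original tables.

```latex
\begin{equation*}
E_i = \beta_0 + \beta_1 H_i + \beta_2\,\mathrm{growth}_i
      + \beta_3\,\mathrm{regional}_i + \beta_4\,\mathrm{energy}_i
      + \beta_5\,\mathrm{CR4}_i + \beta_6\,\mathrm{AD}_i
      + \beta_7\,(\mathrm{AD}_i \times D_i) + e_i
\end{equation*}
```

Here $D_i$ equals one in consumer goods industries and zero otherwise, so $\beta_7$ is the differential advertising slope, and $\beta_0$ is the constant term that absorbs economywide changes in the number of small firms.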

B. Estimation and Results 

Table 3 reports six different specifications of the basic estimating 
equation so that the reader can examine the impact of alternative 
specifications. 19 The null hypothesis is that worker exposure to occu- 


17 Averages of the four-firm concentration ratios in 1967, weighted by the value of 
shipments of each industry, were used to estimate concentration level. Data were 
derived from the 1967 Census of Manufactures. This ratio is similar to that employed by 
Collins and Preston (1968). 

18 In constructing the advertising to sales ratio, we used the average of advertising to 
sales ratios from 1972 to 1975. A differential slope for consumer goods was introduced 
by multiplying the original variable times a zero-one dummy, which assumed a value of 
one in consumer goods industries. This is to allow for the possibility that the role 
advertising plays in entry differs between producer and consumer goods industries. 

19 The regressions reported in table 3 are simple additive specifications. In addition, 
a number of double log specifications were run. We report the additive specifications 

pational hazards will be unrelated to the entry of small firms. This 
hypothesis corresponds to the theoretical proposition that changing 
liability rules did not affect the number of small corporations, which 
means that divestiture provides little protection from latent hazard 
liabilities. The alternative hypothesis is that worker exposure has a 
significant effect on entry by small corporations, implying that corpo¬ 
rate limited liability offers substantial protection from liability costs. 

The statistical results strongly reject the hypothesis that expected 
disease liability, as measured by the exposure of workers to carcino¬ 
gens, has no effect on entry by small corporations. The coefficient of 
the expected liability variable is large and statistically significant in all 
the specifications in table 3. The coefficient estimate is also virtually 
independent of the specification of the estimating equation across the 
six specifications reported in table 3. In these estimations, the median 
coefficient of the hazard variable is .060, its standard error is roughly 
.028, and it is uniformly significant at the 5 percent level. The esti¬ 
mated hazard variable coefficients all lie roughly within one-half of 
one standard error of the median coefficient; the estimated relation¬ 
ship between worker exposure to hazards and entry is remarkably 
insensitive to the specification of the basic estimating equation. 
Hence, the data show a robust, statistically significant relationship 
between entry rates and potential liability costs. 
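
The figures quoted in this paragraph imply a t-ratio and a robustness band that can be checked directly. The sketch below uses only the median coefficient (.060) and the approximate standard error (.028) stated in the text. Whether the resulting ratio clears the 5 percent threshold depends on the degrees of freedom and on the form of the test, neither of which is given in this passage, so no critical value is asserted here.

```python
def t_ratio(coefficient: float, standard_error: float) -> float:
    """Ratio of a coefficient estimate to its standard error."""
    if standard_error <= 0:
        raise ValueError("standard error must be positive")
    return coefficient / standard_error


def half_se_band(coefficient: float, standard_error: float) -> tuple[float, float]:
    """Interval of one-half standard error on either side of the estimate."""
    half = 0.5 * standard_error
    return coefficient - half, coefficient + half


if __name__ == "__main__":
    coef, se = 0.060, 0.028   # median estimate and approximate SE from the text
    print(round(t_ratio(coef, se), 2))            # 2.14
    low, high = half_se_band(coef, se)
    print(round(low, 3), round(high, 3))          # 0.046 0.074
```

On these figures, the statement that all six estimates lie within one-half standard error of the median means that they fall roughly between .046 and .074.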

In addition to being highly statistically significant, the results show 
that hazard exposure is a major determinant of small-firm entry. To 
assess its importance, consider equation 1, the most fully specified 
equation in table 3. The portion of entry that can be attributed to the 
avoidance of liability can be found by multiplying the estimated coef¬ 
ficient 0.060 percent (.00060) times the sample mean of the hazard 
variable (33,010), which yields 19.8 percent. In other words, the in¬ 
centive to evade liability has led to roughly a 20 percent increase in 
the number of small corporations in the U.S. economy. Since there 
was an average 39.2 percent increase in the number of small corpora¬ 
tions over the period, the data show that roughly one-half of the entry 
of small corporations over the period is linked statistically to the expo¬ 
sure of workers to occupational carcinogens. 
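
The arithmetic in this paragraph can be reproduced in a few lines. The sketch below uses only the three figures quoted in the text: the coefficient (.00060), the sample mean of the hazard variable (33,010), and the mean entry rate (39.2 percent). The constant and function names are illustrative.

```python
COEFFICIENT = 0.00060      # percentage points of entry per unit of the hazard index
MEAN_EXPOSURE = 33_010     # sample mean of the hazard variable (table 2)
MEAN_ENTRY_RATE = 39.2     # mean entry rate of small corporations, percent (table 2)


def attributed_entry(coefficient: float, mean_exposure: float) -> float:
    """Entry (in percent) attributed to hazard exposure at the sample mean."""
    return coefficient * mean_exposure


def share_of_entry(attributed: float, total: float) -> float:
    """Fraction of total entry accounted for by the attributed portion."""
    return attributed / total


if __name__ == "__main__":
    attributed = attributed_entry(COEFFICIENT, MEAN_EXPOSURE)
    print(round(attributed, 1))                                   # 19.8
    print(round(share_of_entry(attributed, MEAN_ENTRY_RATE), 2))  # 0.51
```

The share of about 0.51 is the basis for the statement that roughly one-half of small-corporation entry is linked statistically to worker exposure.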

In pure numbers, the incentive effects of liability on small-firm 
entry appear to be substantial. The percentage increase corresponds 
to an increase of more than 25,000 small corporations whose entry is 


because there are negative values for some observations, requiring some kind of ad hoc 
adjustment in the log specifications. Some of the log specifications were run omitting 
observations in which one or more values were negative, and others were run adding 
100 units to all the values of a variable before taking logs so that there would be no 
negative values. In all cases the log specifications yielded results qualitatively similar to 
those reported in table 3. 

tied statistically to changing liability rules. The magnitude of these 
effects suggests substantial efforts by firms to avoid possible liability 
costs associated with latent hazards. 

The findings then suggest that liability has led to a significant trans¬ 
formation in the organization of hazardous production processes. 
Large numbers of small firms are entering hazardous sectors. The 
large magnitude of these effects, moreover, suggests that at least in 
significant sectors the sacrifice of economies of internal organization 
appears to be small relative to liability costs. Hence restructuring ap¬ 
pears to be a path that firms are pursuing in significant numbers in 
efforts to avoid liability. 

These findings raise substantial questions about likely future prob¬ 
lems in enforcement for tort liability in the case of latent hazards. 
Before we examine the policy implications and alternatives, however, 
it is also important to evaluate the basic regression model more fully 
to assess the reliability of the results. 

One way to carry out such an evaluation is to note that in other ways 
the regressions seem to provide plausible results. As theory would 
predict, for example, the variables measuring small-firm energy cost 
advantages and industry growth seem to be closely related to the rate 
of entry of small firms in an industry. The variable measuring small- 
firm energy cost advantages shows significant increases in small firms 
in industries in which small firms are comparatively less energy inten¬ 
sive than larger ones. Industry growth should also lead to a significant 
percentage increase in the number of small firms, and this hypothesis 
is generally confirmed in the various specifications presented in table 
3. In contrast, the entry barrier variables are not generally significant, 
though the two-digit level of aggregation does not provide a powerful 
test of the entry barrier hypothesis. 

An alternative check on the reliability of the basic equations is to 
replicate the statistical analysis for an earlier period. The empirical 
analysis implicitly presumes that the number of small corporations 
was in long-run equilibrium in 1967. An alternative possibility is that 
instead there has been a long, systematic increase in the number of 
small firms in hazardous sectors; despite large t-values, the data may 
merely reflect a spurious historical relationship. 

To investigate this possibility, data were gathered so that the model 
could be estimated over the period 1957-67. Data could be collected 
for all variables except the hazard variable. Concern with hazards and 
the investigation of exposures in the workplace are relatively recent 
phenomena, and we are unaware of reliable hazard data for this 
earlier period. On the other hand, it is unlikely that there would be 
much change in hazardous exposures between the Hickey and Kearney 
study using data from the early 1970s and exposures during the 
1960s. Concern with hazards emerged only in the early 1970s, and 
levels of hazards were likely determined by standard manufacturing 
practices across industries. Given an absence of concern over hazards 
during the 1960s, it is unlikely that hazard levels changed substan¬ 
tially across industries during this period. Accordingly, the hazard 
variable used was the same as that for the original set of regressions. 

The model was then reestimated to see if there had existed previ¬ 
ously a relationship between small-firm entry and exposure to haz¬ 
ards. These regressions were run for corporations with $100,000 in 
assets, and the results are reported in table 4. 20 Inspection of the table 
shows that there is no evidence of a historical relationship. The 
estimated coefficients are negative and statistically insignificant. 

The evidence then shows that the strong, positive relationship be¬ 
tween small corporate entry and hazardous exposures emerged only 
after liability laws were changed during the sample period. The 
strong suggestion is that the relationship between entry and hazards is 
directly linked to changes in liability laws. 

Finally, to explain our results, it might be argued that small firms 
are simply lower-cost producers of hazardous products. Such an ex¬ 
planation, however, does not explain why such a cost advantage 
would have emerged in the late 1960s or early 1970s, which is neces¬ 
sary to explain the change in the number of small corporations that 
occurred during that particular period. Hence, there do not appear 
to be simple, competing explanations of the econometric findings. 
The implication is that the empirical evidence suggests a large-scale 
effort by firms to structure hazardous production processes to avoid 
liability. 


IV. Interpretations, Policy Options, and 
Conclusions 

The empirical results indicate that liability has led to a substantial
restructuring of production in hazardous sectors of the economy.
The implication of these findings is that expected liability costs seem
to outweigh the returns to internal organization for hazardous activities,
and as a result small firms are playing an increasingly important
role in hazardous sectors. This finding suggests that there may be


20 Recall that in the primary estimations we used $100,000 in 1967 dollars and
$250,000 in 1980 dollars as our size boundaries. The low inflation rates of the late
1950s and early to mid 1960s precluded use of a similar procedure in these regressions.
The constant term in the regressions, however, will capture any systematic samplewide
changes in the number of small firms. Hence, changes in the classification of firms due
to price inflation, which should be (roughly) uniformly distributed throughout the
sample, will be captured by the constant term in the regression.

substantial enforcement problems associated with the use of tort liability
as a solution to problems of latent hazards.

Accordingly, when one is evaluating the desirability of liability as a
policy option, it becomes important to consider both the problems
posed by enforcement and the net incentive effects of corporate restructuring.
With latent hazards, liability can be difficult to enforce
because the injury-based nature of claims provides agents with substantial
opportunities to avoid payment during the latency period.
When firms successfully avoid payment, liability provides neither the
incentives for care nor the insurance function normally attributed to
it. Moreover, when liability leads to a preponderance of small firms
and rapid turnover in hazardous sectors, it undermines the accumulation
of experience in dealing with hazards, so that safety does not rise
with the accumulated lessons of the past. The implication, then, is that
liability for latent hazards differs substantially from its application in
more traditional settings.

There are a number of ways in which the liability system can be
modified in an effort to deal with these problems, but each has
drawbacks of its own. One obvious solution is to assess damages earlier to prevent
firms from avoiding damages. The difficulty with this solution is that
it subverts the key distinguishing feature of liability. The key advantage
of liability is that it permits ex post compensation after injuries can be
commonly observed. Imposing liability earlier would mean that injuries
could not be observed, and so a liability system either degenerates
into a simple lump-sum transfer to all workers or acts like a labor
tax that is rebated to workers. Hence liability’s essential feature of
linking payments to ex post damages is lost, as is its implicit insurance
function.

Another option is to attempt to apportion damages according to
market share, which has been attempted in the DES cases (see also
Sindell v. Abbott Labs., cited in n. 10). The problem with this solution,
however, is that payments are not conditioned on the individual firm’s
safety activities, and so it becomes essentially a form of social insurance.
Finally, there have been numerous attempts in the asbestos
cases to link insurance payments to the degree of worker exposure
over time. This scheme, however, also implicitly permits liability obligations
to accrue over many years, creating incentives to avoid liability
similar to those found here.

The implication of these findings, then, is that liability for latent
hazards differs substantially from other forms of liability because of
the potentially significant problems posed by enforcement. Moreover,
there does not appear to be a simple remedy to these enforcement
problems that preserves liability’s traditional incentive characteristics.
The general implication, though, is that enforcement problems are a
fundamental concern and need to be considered carefully in continuing
efforts to deal with the latent hazard problem.


A Nonparametric Investigation of Duration
Dependence in the American Business Cycle


Francis X. Diebold 

University of Pennsylvania 


Glenn D. Rudebusch


Does the termination probability of a business expansion or contraction
increase with age? This question may be formally addressed by
analyzing the nature of duration dependence in aggregate economic
activity. Our null hypothesis is that there is no duration dependence,
which we test via intentionally nonparametric procedures. We also
argue that a common notion of business cycle periodicity can be
usefully interpreted in terms of whole-cycle duration dependence.
We find some evidence for duration dependence in whole cycles and
in prewar expansions, but little evidence elsewhere.


I. Introduction 

Several authors have recently modeled the business cycle as the outcome
of a Markov process that switches between two discrete states,
with one of the states representing expansions and the other representing
recessions. However, very different specifications have been
adopted for the transition probability matrix governing the movement
of the economy between these two states. For example, Neftci
(1982) assumed that the transition probabilities were duration dependent;
in particular, he assumed that the longer the economy remained
in one state, the more likely it was to change to the other. 1 In
contrast, Hamilton (1989) assumed that the state transition probabilities
were duration independent so that, for example, after a long
expansion (i.e., a long time in the expansion state), the economy was
no more likely to switch to the recession state than after a short expansion. 2

To resolve the question of the duration dependence of expansions
and contractions, we investigate the nature of the probability process
that generates their lengths. In addition, we consider the evidence for
duration dependence in the lengths of whole cycles measured from
peak to peak and from trough to trough. Whole-cycle duration dependence
is obviously related to the question of half-cycle duration
dependence, but it can also be interpreted in terms of a weak definition
of stochastic periodicity, namely, that business cycle lengths tend
to cluster around a certain duration. We argue that this notion of
periodicity was implicit in an earlier literature on business cycles. For
example, the classical “8-year” business cycle was distinguished as a
cycle by its tendency to endure 8 years.

Of interest, of course, is the significance of the tendency of business
fluctuations to maintain a fixed cyclical length. Early on, Irving Fisher
(1925) argued that business cycles had no such tendency, but that
instead they resembled “Monte Carlo cycles,” the phantom cycles of
luck perceived by gamblers at a casino. Similarly, to a casual observer
of a repeated coin toss, runs of consecutive heads or tails may appear
more likely to end as they grow longer, but the termination probability
of a run actually remains constant. As Fisher would argue, one
may tabulate the number of consecutive heads in repeated trials and
find the average length of these runs, but there is no intrinsic clustering
of run lengths, or periodicity, in the process. It is precisely this
interpretation of weak business cycle periodicity that we shall test as
our null hypothesis.
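
Fisher's coin-toss analogy can be illustrated with a small simulation. The sketch below is not part of the original analysis: the function names, the seed, and the number of tosses are assumptions chosen for illustration. It records the lengths of runs of identical outcomes in a simulated fair coin sequence and tabulates, for each run age, the share of runs that end at that age among the runs that reached it. Under a constant termination probability this empirical hazard should stay near 0.5 at every age.

```python
import random
from collections import Counter


def run_lengths(n_tosses, p_heads=0.5, seed=0):
    """Lengths of completed runs of identical outcomes in a simulated toss sequence."""
    rng = random.Random(seed)
    tosses = [rng.random() < p_heads for _ in range(n_tosses)]
    lengths, current = [], 1
    for prev, cur in zip(tosses, tosses[1:]):
        if cur == prev:
            current += 1
        else:
            lengths.append(current)
            current = 1
    return lengths  # the final, unfinished run is dropped


def empirical_hazard(lengths, max_age):
    """Share of runs ending at each age among the runs that survived to that age."""
    counts = Counter(lengths)
    at_risk = len(lengths)
    hazard = {}
    for age in range(1, max_age + 1):
        if at_risk == 0:
            break
        hazard[age] = counts[age] / at_risk
        at_risk -= counts[age]
    return hazard


if __name__ == "__main__":
    for age, h in empirical_hazard(run_lengths(200_000), 5).items():
        print(age, round(h, 3))
```

Long runs are rare, so the estimates at higher ages rest on fewer observations and are noisier; that is a sampling effect, not duration dependence.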

In Section II, we explore more fully the notion of duration dependence
in a macroeconomic context. In Section III, we provide a weak
definition of periodicity that will be useful in interpreting the duration
dependence of whole cycles. Section IV describes our empirical
methodology, which employs nonparametric tests for duration dependence.
These tests are based on the conformity of the lengths of
half cycles and whole cycles to the exponential distribution, which
corresponds to an absence of duration dependence. Empirical results
are presented in Section V, and Section VI concludes with an interpretation
in the light of recent developments in macroeconomics.


1 This view has been expressed often in the popular press, e.g., with the suggestion
that a very long expansion is unstable and is unusually likely to end.

2 This is also the assumption of Diebold and Rudebusch (1989b).


II. Macroeconomic Duration Dependence 

A large statistical and econometric literature has addressed the interpretation
of duration data. 3 A basic element of this analysis is the
hazard function, denoted here as $\lambda(\tau)$, which is the conditional
probability that a process will end after a duration of length $\tau$, given that
it has not terminated earlier. For example, microeconomic data indicate
that lengths of employment for individuals exhibit a decreasing
hazard function ($d\lambda(\tau)/d\tau < 0$) or negative duration dependence; that
is, the longer a job is held, the less likely it is to be lost. This section
presents some aspects of duration analysis that are relevant for macroeconomics.

Two examples of hazard functions are shown in figure 1. The constant
hazard function, $\lambda_1(\tau) = \lambda$ (dashed line), reflects a termination
probability with no duration dependence. The linearly increasing
hazard function, $\lambda_2(\tau) = \gamma\tau$ (solid line), reflects a termination
probability with positive duration dependence, so that termination
probability increases with time. The question of the appropriate
specification of a Markov model of the business cycle can be reduced to
determining whether expansions and contractions are governed by a
constant hazard, as assumed by Hamilton (1989), or by a nonconstant
hazard such as $\lambda_2(\tau)$, as assumed by Neftci (1982). 4

A given hazard function, $\lambda(\tau)$, provides a complete characterization
of the unconditional density of durations, $f(\tau)$, since

$$f(\tau) = \lambda(\tau) \exp\left[-\int_0^{\tau} \lambda(u)\,du\right]. \tag{1}$$

Figure 2 displays the duration densities associated with the hazard
functions given in figure 1. The constant hazard implies an exponential
density of durations (dashed line), 5

$$f_1(\tau) = \lambda \exp(-\lambda\tau), \quad \tau \ge 0. \tag{2}$$


3 This literature is well surveyed in Kiefer (1988).

4 As a related issue, Neftci (1984) investigates whether the hazard rates of expansions
and contractions are the same, i.e., whether the state transition matrix and hence
business cycles are symmetric. In contrast to his earlier work, Neftci performs the
analysis under an assumption of time-invariant transition probabilities.

5 In discrete time, the corresponding probability distribution is geometric,
$f(\tau) = (1 - \lambda)^{\tau - 1}\lambda$, $\tau = 1, 2, 3, \ldots$, which has the obvious coin toss
interpretation of Fisher.

[Fig. 1: vertical axis, termination probability]



Thus given a constant probability $\lambda$ of termination, the density of
durations is monotonically declining. Alternatively, the linearly
upward-sloping hazard implies a particular nonexponential density
of durations (solid line),

$$f_2(\tau) = \gamma\tau \exp\left(-\frac{\gamma}{2}\tau^2\right), \quad \tau \ge 0. \tag{3}$$

This density is nonmonotonic and unimodal, and there is a clear
concentration of probability mass around the modal value.
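
The mapping from a hazard function to a duration density can be checked numerically. The following sketch is an illustration only; the function name, the trapezoid rule, and the parameter values are assumptions, not part of the paper. It evaluates the hazard-to-density relation by integrating the hazard from 0 to a given duration, and then locates the mode of the density implied by a linearly increasing hazard with slope `gamma`.

```python
import math


def density_from_hazard(hazard, tau, steps=2000):
    """Duration density at tau: hazard(tau) * exp(-integral of hazard over [0, tau]).

    The integral uses the trapezoid rule, which is exact for the constant
    and linear hazards used as examples here.
    """
    if tau == 0:
        return hazard(0.0)
    h = tau / steps
    integral = 0.5 * (hazard(0.0) + hazard(tau)) * h
    integral += h * sum(hazard(k * h) for k in range(1, steps))
    return hazard(tau) * math.exp(-integral)


if __name__ == "__main__":
    gamma = 2.0  # illustrative slope of the increasing hazard
    grid = [k / 1000 for k in range(1, 3001)]
    mode = max(grid, key=lambda t: density_from_hazard(lambda u: gamma * u, t, steps=50))
    # For a hazard gamma * tau the density peaks at tau = gamma ** -0.5.
    print(mode, gamma ** -0.5)
```

With a constant hazard the same function returns a density that declines monotonically from its value at zero, which is the contrast the two curves in figure 2 are meant to convey.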


Fig. 2.—Duration distributions associated with increasing and constant hazards
(vertical axis: probability)

The specific distribution of durations corresponding to a nonconstant
hazard will of course depend on that hazard’s particular form.
In general, however, the probability mass associated with a hazard
displaying positive duration dependence is more concentrated
around its mean than that associated with the exponential distribution
of the same mean. 6 This is an implication of the turning point
probability’s rising with duration. Consider, for example, the increasing
hazard, $\lambda_2(\tau) = \gamma\tau$, which implies a duration density with mean
$E(\tau) = (\pi/2\gamma)^{1/2}$ and variance $\mathrm{var}(\tau) = (4 - \pi)/2\gamma$. Note that
$d\lambda_2(\tau)/d\tau$ is positive and increasing in $\gamma$, while $\mathrm{var}(\tau)$ is
decreasing in $\gamma$. That is, as the
amount of positive duration dependence increases, the variance of
the durations decreases. In addition, the exponential density with an
identical mean has a larger variance since the exponential density
with mean duration $(\pi/2\gamma)^{1/2}$ has variance $\pi/2\gamma$, which is of course
greater than $(4 - \pi)/2\gamma$ for all positive $\gamma$.
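
A scale-free way to read this comparison, added here as an illustrative calculation that uses only the moments quoted above, is the ratio of the variance to the squared mean:

```latex
\frac{\operatorname{var}(\tau)}{[E(\tau)]^{2}}
  = \frac{(4-\pi)/2\gamma}{\pi/2\gamma}
  = \frac{4-\pi}{\pi} \approx 0.27
  \quad \text{for } \lambda_2(\tau) = \gamma\tau,
\qquad
\frac{\operatorname{var}(\tau)}{[E(\tau)]^{2}} = 1
  \quad \text{for the exponential density.}
```

The ratio does not depend on $\gamma$: for this linear hazard the standard deviation is always about 52 percent of the mean, against 100 percent for an exponential density that starts at zero.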

To summarize, a constant hazard implies an exponential distribution
of durations. Thus an exponential distribution of historical
lengths of expansions and contractions is precisely the null hypothesis
implicit in Fisher (1925) and Hamilton (1989), and it is the one that
we shall test below. Furthermore, the positive duration dependence
of an increasing hazard induces duration “clustering” around the
mean duration, relative to the constant-hazard case. As we describe in
the next section, for durations of whole cycles, this clustering has a
natural interpretation.

III. Business Cycle Periodicity 

In this section, whole-cycle positive duration dependence is related to
a weak definition of periodicity, an interpretation that provides intuitive
content to the former and empirical content to the latter. To both
motivate and clarify our discussion, we shall elucidate several different
forms of periodicity, including deterministic and stochastic and
strong and weak.

We shall say that a variable $X_t$ displays deterministic strong periodicity
of period $T$ if $X_{t+T} = X_t$, for all $t$. 7 This type of periodicity is found in
many early macroeconomic models, such as the multiplier-accelerator
and inventory systems of Samuelson (1939) and Metzler (1947).
Samuelson’s well-known analysis, for example, uses a multiplier-accelerator
system to derive a deterministic second-order difference equation
for aggregate output. Over a certain range of parameters, this equation
produces stable deterministic cycles with a constant period of the
type shown in figure 3.

6 This general proposition can be proved, as suggested to us by Martin Wells, by
noting the strict concavity of the log survivor function, $\log[1 - F(\tau)]$, when $\lambda(\tau)$ is
strictly increasing. Marshall and Olkin (1979, p. 494) show that this concavity implies
that the $r$th moments about the origin, $\mu_r$, are concave in logs when normalized by $r$
factorial ($r!$). In particular, $\log(\mu_1) > \frac{1}{2}\log(\mu_0) + \frac{1}{2}\log(\mu_2/2)$. After rearrangement,
this implies that $\mathrm{var}(\tau)$ is less than $[E(\tau)]^2$, which is equal to the variance of the
exponential distribution with mean $E(\tau)$.

7 This definition and the ones that follow abstract from considerations of growth; we
also disregard trivial cases such as a constant $X_t = k$.

A stochastic framework provides a more realistic basis for analysis
of periodicity in economics. The definition of stochastic strong periodicity
of period $T$ is a straightforward generalization that replaces the equality
of $X_t$ and $X_{t+T}$ with a high correlation between these values for all $t$.
Such periodicity has a more precise frequency domain definition as a
peak in the spectral density at the frequency corresponding to period
$T$. Frisch (1933) demonstrated that a structural propagation mechanism
can convert uncorrelated stochastic impulses into cyclical output
with stochastic strong periodicity. This idea of a stochastic, periodic
cycle obtained from a perturbed macroeconomic system was the foundation
for large-scale macroeconometric models (see, e.g., Klein
1983). However, there has been little empirical support for stochastic
strong periodicity in economic fluctuations. Perhaps the most influential
evidence against such periodicity is provided by the spectra of
macroeconomic variables, which are typically monotonically declining
from low to high frequencies (except at seasonal frequencies) with
little power concentration at business cycle frequencies (e.g., Granger
1966; Sargent 1987, chap. 11). 8

8 This evidence should be interpreted with caution, however, given the small samples

Fig. 4.—Deterministic peak-to-peak weak periodicity


We shall attempt to assess the evidence for a weaker form of periodicity.
The essential feature of a strongly periodic process is the
close relationship between $X_t$ and $X_{t+T}$ for all $t$. For the irregular cycles
of business activity, weaker forms of periodicity, which depend on
periodic repetition for only certain $t$, are useful. For example, we define
deterministic peak-to-peak weak periodicity (of period $T$) to exist for a
series if for every $t$ such that $X_t$ is a peak in the series, $X_{t+T}$ is also a
peak. 9 This is shown in figure 4 with a series that has uniformly
spaced cyclical peaks but is not periodic at every point in the cycle as
in figure 3. In particular, note that this series does not exhibit deterministic
trough-to-trough weak periodicity, which is exhibited when a
trough at time $t$ is always followed by a trough at time $t + T$. 10
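
The definition of deterministic weak periodicity can be made concrete with a toy series. The sketch below is illustrative and rests on assumptions that are not in the text: turning points are identified by a simple strict local-extremum rule (NBER dating is far more judgmental), and the function names and the example series are hypothetical. The series has peaks every six periods but unevenly spaced troughs, so it is weakly periodic from peak to peak and not from trough to trough.

```python
def turning_points(x):
    """Return a list with 1 at strict local peaks, -1 at strict local troughs, else 0."""
    y = [0] * len(x)
    for t in range(1, len(x) - 1):
        if x[t] > x[t - 1] and x[t] > x[t + 1]:
            y[t] = 1
        elif x[t] < x[t - 1] and x[t] < x[t + 1]:
            y[t] = -1
    return y


def weakly_periodic(y, period, kind=1):
    """True if every turning point of the given kind at t recurs at t + period.

    Turning points whose successor date falls outside the sample are skipped.
    """
    marked = [t for t, v in enumerate(y) if v == kind]
    return all(y[t + period] == kind for t in marked if t + period < len(y))


if __name__ == "__main__":
    series = [0, 1, 2, 3, 2, 1, 0, 2, 2.5, 3, 1, 0, 1, 2, 2.5, 3, 2, 1]
    y = turning_points(series)
    print(weakly_periodic(y, 6, kind=1))   # peaks at t = 3, 9, 15
    print(weakly_periodic(y, 6, kind=-1))  # troughs at t = 6, 11
```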

involved and the sensitivity of the results to various types of trend adjustment. Furthermore,
spectral methods are intrinsically linear and are not compatible with the Markov
framework of Neftci (1982) and Hamilton (1989) (see also Neftci 1986; Diebold and
Rudebusch 1989a).

9 This definition can be formalized with a function, TP(·), that signals turning points.
Specifically, if $Y_t = TP(X_t)$, then $Y_t$ is a sequence that is always zero except at a peak in $X_t$,
when $Y_t = 1$, and at a trough in $X_t$, when $Y_t = -1$. The series $X_t$ displays deterministic
peak-to-peak weak periodicity if, for each $t$ such that $Y_t = 1$, $Y_{t+T} = 1$.

10 Clearly, strong periodicity implies weak periodicity but not conversely; however,
the two definitions of periodicity can be closely linked by a time deformation. Stock
(1987) argues that macroeconomic variables appear to evolve on an economic time scale
that may speed up or slow down relative to the observed calendar time scale. In such a
setting, a cyclical process that is strongly periodic in economic time would be distorted
by the nonlinear time deformation into a nonperiodic process in calendar time. However,
if the speeding up and slowing down of economic time relative to calendar time
averaged out over the cycle, the process would still display weak periodicity in the

The concept of weak periodicity can be extended to a stochastic
framework. A series displays stochastic peak-to-peak weak periodicity
(of period $T$) if for every $X_t$ that is a peak in the series, $X_{t+\tau}$ is also a
peak, where $\tau$ is a random variable with mean $T$ and variance $\sigma^2$. 11
Stochastic peak-to-peak weak periodicity implies that there is a tight
distribution of observed peak-to-peak cycle durations ($\tau$) around the
mean period; that is, $\sigma^2$ is small. Deterministic peak-to-peak periodicity
emerges, of course, when $\sigma^2 = 0$. More generally, however, a
natural metric with which to evaluate the size of $\sigma^2$, and hence the
extent of periodicity, is provided by the exponential distribution. Recall
from the last section the close relationship between positive duration
dependence and duration clustering relative to an exponential
distribution. In particular, if the durations of cycles from peak to
peak are clustered around a period of 4 years, then a 2-year-old cycle
is less likely to end (i.e., more likely to survive 2 more years) and a
6-year-old cycle is more likely to end (i.e., less likely to survive even
longer than 4 years) than a 4-year-old cycle. Thus, for periodic cycles,
the probability of a peak is increasing with the length of the ongoing
cycle. Nonperiodic cycles, on the other hand, have no particular interval
after which they are more likely to end; their turning points are
not positively related to the age of the cycle. In this sense, the exponential
distribution provides a metric for the extent of periodicity; it
allows one to ask whether the distribution of actual business cycle
durations is more closely clustered than would be expected from a
constant hazard probability model with the same mean duration.

The stochastic weak form of periodicity, defined in terms of a clustering
tendency of intervals between turning points, has been used
implicitly in many previous discussions of business fluctuations. For
example, Matthews (1959, p. 216), in a chapter on business cycle
periodicity, implicitly adopts this definition when describing the path
of British investment: “Apart from the minor wobbles in the curve
around 1877 and 1902, the durations of the cycles measured from
trough to trough are 6, 8, 10, 5 years; measured from peak to peak
they are 9, 7, 10, 7 years. This is not precisely a seven to ten-year cycle,
but it is as near to it as anyone could reasonably expect.” The data he
presents are suggestive of a clustering of cycle lengths, that is, weak
periodicity. 12 We shall examine more rigorously the empirical distributions
of durations of whole cycles and half cycles with procedures
detailed in the next section.


constancy of peak-to-peak or trough-to-trough durations. Indeed, fig. 4 is generated by
applying precisely such a time deformation to fig. 3.

11 The stochastic form of weak trough-to-trough periodicity is similarly straightforward.

12 For other examples, see Adelman and Adelman (1959, p. 614) (who note approvingly
the equivalence of peak-to-peak and trough-to-trough durations in the Klein-Goldberger
model and in historical cycles), Zarnowitz (1985, pp. 525-26), and Britton


IV. Nonparametric Tests for Duration 
Dependence 

We use nonparametric methods to directly test observed durations
for conformity to the exponential distribution. Our analysis is intentionally
nonparametric since we do not estimate and test a particular
hazard model. The imposition of incorrect parametric forms can distort
the available departures from the null hypothesis, and it is now
well known that incorrect parameterizations of the hazard function
can lead to severely misleading inferences (see, e.g., Heckman and
Singer 1984).

A description of our testing methodology first requires discussion
of the data. The lengths of expansions, contractions, and whole cycles
are derived from business cycle turning dates since 1854, as designated
by the National Bureau of Economic Research (NBER). These
durations (in months) are given in table 1 and provide the raw data
for our analysis. 13 By definition, a cycle is designated in the NBER
methodology only if it has achieved a certain maturity. Burns and
Mitchell (1946, pp. 57-58) describe this criterion: “We do not recognize
a rise and fall as a specific cycle unless its duration is at least
fifteen months, whether measured from peak to peak or trough to
trough. Fluctuations lasting less than two years are scrutinized with
special care.” Forty years later, Moore and Zarnowitz (1986), in a
survey of the NBER methodology, reaffirm this maturity criterion.
They indicate that full cycles of less than 1 year in duration and
contractions of less than 6 months would be very unlikely to qualify
for selection.

Previous examinations of macroeconomic duration dependence,
including McCulloch (1975), Savin (1977), and de Leeuw (1987), also
have recognized this maturity criterion. However, these earlier studies
are limited in two major respects. First, previous analyses examine
only the durations of expansions and contractions, but not whole
cycles; thus the evidence provided is incomplete. 14 Second, the earlier
work obtains evidence on duration dependence from the goodness of
fit of estimated sample histograms to a null constant hazard distribution.
The power of these tests has been questioned in small sample
sizes (see Sichel 1989), and the results obtained with this method often
depend on the arbitrary number of cells used in histogram construction
(see the sensitivity analysis of Diebold and Rudebusch [1990]).


(1986, p. 3). The last of these, which is devoted exclusively to an examination of
business cycle periodicity, states that “this ‘central tendency’ [of cyclical durations]
is another way of describing the phenomenon with which the present study is concerned.”

13 In our samples that include postwar expansions, there is a right-censoring problem
associated with the current expansion. We have assumed that this last duration is 80
months instead of its unknown, but longer, true length. This affects the durations of
the last expansion, the last peak-to-peak cycle, and, with the additional assumption of a
following 9-month contraction, the last trough-to-trough cycle. Since the current expansion
is already quite long by historical standards, any additional length would shift
the results slightly in the direction of no duration dependence. All our results are
robust to varying the length of this final expansion over a wide range.

We shall apply nonparametric tests that have greater power and do
not involve the arbitrary factors involved with histogram construction.
Rather than grouping observations into histogram bins and
thereby discarding information, these tests compare the observations
with their ordered rank. The null hypothesis is

$$H_0\colon f(\tau) = \lambda \exp[-\lambda(\tau - t_0)], \quad \tau \ge t_0, \quad \lambda \text{ unknown}, \ t_0 \text{ unknown}. \tag{4}$$

That is, the duration random variable $\tau$ has an exponential probability
density function, where $\lambda$ has the earlier interpretation as the
constant hazard and $t_0$ is the unknown minimum possible duration
from the NBER maturity criterion, which will differ for expansions,
contractions, and whole cycles. Shapiro and Wilk (1972) extended
their well-known test for normality to provide a similar test for the
exponential null $H_0$. Renumber the durations in ascending order, so
that $x_1 \le x_2 \le \ldots \le x_N$; then


$$W = \frac{(\bar{x} - x_1)^2}{(N - 1)\hat{\sigma}^2}, \tag{5}$$

where $\bar{x} = \sum_{i=1}^{N} x_i / N$ and $\hat{\sigma}^2 = \sum_{i=1}^{N} (x_i - \bar{x})^2 / N$. The W statistic is a
scaled ratio of the squared difference between the mean and shortest
duration to the sample variance. The distribution of W is invariant to
the true values of $\lambda$ and $t_0$, and its exact finite-sample critical values
have been tabulated by Shapiro and Wilk for N ranging from three to
100.
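
As a minimal sketch of how the W statistic could be computed from a list of durations, consider the function below. It follows the verbal description in the text (squared gap between the sample mean and the shortest duration, divided by N - 1 times the sample variance with divisor N); the function name and the example numbers are hypothetical, and no critical values are included because those must be taken from the Shapiro and Wilk tables.

```python
def w_statistic(durations):
    """W for the exponential null with unknown minimum duration.

    Computed as (mean - shortest)^2 / ((N - 1) * variance), where the
    variance uses divisor N.
    """
    x = sorted(durations)
    n = len(x)
    if n < 3:
        raise ValueError("need at least three durations")
    mean = sum(x) / n
    var_hat = sum((xi - mean) ** 2 for xi in x) / n
    if var_hat == 0:
        raise ValueError("durations are all identical")
    return (mean - x[0]) ** 2 / ((n - 1) * var_hat)


if __name__ == "__main__":
    sample = [1, 2, 3, 4]  # made-up durations, for illustration only
    print(w_statistic(sample))
    # Shifting or rescaling every duration leaves W unchanged.
    print(w_statistic([10 * v + 7 for v in sample]))
```

Because the statistic is unchanged by shifting or rescaling the data, its null distribution cannot depend on the minimum duration or on the hazard rate, which is the invariance property noted above.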

Also relevant to our investigation is a modified W statistic developed
by Stephens (1978) for testing exponentiality conditional on an
assumed known minimum duration, $t_0 = \gamma$, so that the null hypothesis
becomes

$$H_0'\colon f(\tau) = \lambda \exp[-\lambda(\tau - \gamma)], \quad \tau \ge \gamma, \quad \lambda \text{ unknown}, \ \gamma \text{ known}. \tag{6}$$


14 Expansion and contraction durations individually could show no duration dependence
but be negatively correlated during the cycle so as to induce duration dependence
in whole cycles.

Define $A = \sum_{i=1}^{N} (x_i - \gamma)$ and $B = \sum_{i=1}^{N} (x_i - \gamma)^2$. Then the new
statistic, denoted $W(t_0 = \gamma)$, is given by

$$W(t_0 = \gamma) = \frac{A^2}{N[(N + 1)B - A^2]}. \tag{7}$$

The statistic $W(t_0 = \gamma)$ has the same distribution for a sample of size N
as the W statistic has for a sample of size N + 1, so the same table of
finite-sample critical values can be used. Both of these statistics allow
for the absence of short durations, but the W statistic incorporates a
true but unknown $t_0$ value into the null hypothesis, while $W(t_0 = \gamma)$
conditions on an assumed $t_0$ value. The $W(t_0 = \gamma)$ test is useful given
information about the NBER maturity criterion and the likely range
of the minimum allowable duration $t_0$; furthermore, a sensitivity analysis
that varies $t_0$ is readily performed. 15
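
The link between the two statistics can be illustrated numerically. In the sketch below the function names and sample values are hypothetical. The first function computes the Stephens statistic from the sums A and B of deviations, and squared deviations, from the assumed minimum; the second computes the unconditional W. Appending the assumed minimum to the sample as an extra, smallest observation and computing the unconditional W on the N + 1 points gives the same number, which is one way to see why the same table of critical values applies with N + 1 in place of N.

```python
def w_known_minimum(durations, minimum):
    """Stephens-type W conditional on an assumed minimum duration."""
    n = len(durations)
    if min(durations) < minimum:
        raise ValueError("assumed minimum exceeds an observed duration")
    a = sum(x - minimum for x in durations)
    b = sum((x - minimum) ** 2 for x in durations)
    denom = n * ((n + 1) * b - a * a)
    if denom == 0:
        raise ValueError("all durations equal the assumed minimum")
    return a * a / denom


def w_unknown_minimum(durations):
    """Unconditional W: (mean - shortest)^2 / ((N - 1) * variance with divisor N)."""
    x = sorted(durations)
    n = len(x)
    mean = sum(x) / n
    var_hat = sum((xi - mean) ** 2 for xi in x) / n
    return (mean - x[0]) ** 2 / ((n - 1) * var_hat)


if __name__ == "__main__":
    sample = [1, 2, 3, 4]  # made-up durations
    for assumed_minimum in (0, 0.5, 1):  # a small sensitivity analysis
        print(assumed_minimum,
              w_known_minimum(sample, assumed_minimum),
              w_unknown_minimum([assumed_minimum] + sample))
```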

Finally, we examine another class of nonparametric tests for the
exponential distribution. Consider first the null hypothesis $H_0$ and
define the normalized spacings between the ordered durations as

$$Y_i = (N - i + 1)(x_i - x_{i-1}), \quad i = 2, \ldots, N. \tag{8}$$

A plot of $Y_i$ versus $i$ provides a mirror image of the plot of the hazard
function; that is, increasing spacings imply a decreasing hazard function.
Thus in a regression of normalized spacings on order, namely,
$Y_i = \alpha + \beta i$, the exponential hypothesis implies that $\beta = 0$. Brain
and Shapiro (1983) exploit this result to obtain a test statistic for
exponentiality, denoted Z. Let $\tilde{i}$ and $\tilde{Y}_i$ denote the “de-meaned” variables
$i - (N/2)$ and $Y_i - \bar{Y}$. Then


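$$Z = \frac{\sum_{i=1}^{N-1} \tilde{i}\,\tilde{Y}_{i+1}}{\sum_{i=1}^{N-1} Y_{i+1} \left[\sum_{i=1}^{N-1} \tilde{i}^{2} \big/ N(N - 1)\right]^{1/2}}. \tag{9}$$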


The distribution of the Z statistic is asymptotically N(0, 1), which it
quickly approaches even in quite small samples. Furthermore, an assumed
known minimum duration $t_0 = \gamma$ also can be conditioned on
with the Z statistic to test null hypothesis $H_0'$. Simply consider $\gamma$ as
an additional observation and include as the first weighted spacing
$Y_2 = N(x_1 - \gamma)$ in the calculations in equation (9) (running the iteration
from one to N). The modified test statistic is denoted $Z(t_0 = \gamma)$.
Brain and Shapiro also provide an alternative statistic, denoted Z*,
that is intended to be more sensitive to alternative duration distributions
associated with nonlinear hazard functions. 16 The statistic Z* is
constructed from a linear regression and a quadratic regression of $Y_i$
on order and has an asymptotic chi-squared distribution that appears,
from the simulation study in Brain and Shapiro, to be appropriate
even in small samples.


15 A clear trade-off emerges between W and $W(t_0 = \gamma)$. If the conditioning information
employed in the latter is correct, it is expected to have higher power; if it is
incorrect, nominal and empirical size will diverge. Since the validity of a chosen $t_0$ value
cannot be ascertained exactly a priori in our application, the W and $W(t_0 = \gamma)$ tests are
useful in conjunction.
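
The spacings-based test can be sketched as follows. This is an illustrative reading rather than a transcription of the authors' procedure. It assumes the statistic is the sum of centered order indices times the normalized spacings, divided by the sum of the spacings and by the square root of the sum of squared centered indices over n(n - 1), where n counts the points in the (possibly augmented) sample; that scaling gives unit variance under the exponential null. The function names, the `origin` argument, and the sample values are hypothetical.

```python
import math


def normalized_spacings(durations, origin=None):
    """Spacings (n - i + 1) * (x_i - x_{i-1}) for i = 2..n over the sorted sample.

    If an assumed minimum duration is supplied as origin, it is treated as an
    additional, smallest observation.
    """
    xs = sorted(durations)
    if origin is not None:
        xs = [origin] + xs
    n = len(xs)
    return [(n - i + 1) * (xs[i - 1] - xs[i - 2]) for i in range(2, n + 1)]


def z_statistic(durations, origin=None):
    """Slope-type statistic: near zero when spacings show no trend in order."""
    y = normalized_spacings(durations, origin)
    m = len(y)   # number of spacings
    n = m + 1    # number of points, including any assumed minimum
    centered = [i - n / 2 for i in range(1, m + 1)]
    numerator = sum(c * yi for c, yi in zip(centered, y))
    scale = math.sqrt(sum(c * c for c in centered) / (n * (n - 1)))
    return numerator / (sum(y) * scale)


if __name__ == "__main__":
    sample = [1, 2, 3, 4]  # evenly spaced, so normalized spacings shrink with order
    print(normalized_spacings(sample))
    print(z_statistic(sample))            # negative: consistent with a rising hazard
    print(z_statistic(sample, origin=0))  # conditional on an assumed minimum of 0
```

A negative value corresponds to spacings that shrink with order, hence an increasing hazard; a positive value corresponds to a decreasing hazard.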

A number of Monte Carlo studies have examined the power of the
W and Z tests against various alternatives, including the Weibull,
chi-squared, half-normal, and lognormal distributions. 17 Overall, the W
and Z tests appear to be comparable in their ability to detect departures
from exponentiality, with small comparative advantages for one
or the other against specific alternative distributions. Both appear to
have excellent power in the range of small sample sizes relevant for
our analysis.


V. Empirical Results 

Besides performing the constant hazard tests on the full samples of
expansions, contractions, and peak-to-peak and trough-to-trough cycles,
we also examine a variety of subsamples. These include only
pre- or post-World War II observations and may exclude wartime
expansions and the whole cycles that contain them. The various duration
samples investigated are listed in table 2 with their associated
sample size, mean duration, and standard error. 18 The variation in
the standard error, one measure of dispersion, anticipates some of
our later statistical results, which will also account for sample size,
mean duration, and minimum duration.
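
A rough reading of the dispersion figures in table 2 can be sketched in a few lines. The calculation below is an illustration, not the authors' procedure; the dictionary and function names are hypothetical, and only a subset of rows is included. It relies on one fact about the exponential null with a minimum duration: the standard deviation equals the mean minus that minimum. A ratio of standard deviation to mean below one is therefore not evidence against the null by itself, and the difference between mean and standard deviation gives a crude moment-based value for the minimum duration the null would require.

```python
# (mean duration, standard error) in months, copied from table 2.
TABLE2 = {
    "E1 expansions, entire sample": (34.6, 21.8),
    "E5 expansions, pre-WWII": (26.5, 10.7),
    "C2 contractions, post-WWII": (10.7, 3.4),
    "PP1 peak to peak, entire sample": (52.8, 24.9),
    "TT1 trough to trough, entire sample": (52.3, 22.1),
}


def dispersion_summary(mean, sd):
    """Ratio of dispersion to mean, and the minimum duration implied by mean - sd."""
    return {"sd_over_mean": sd / mean, "implied_minimum": mean - sd}


if __name__ == "__main__":
    for label, (mean, sd) in TABLE2.items():
        s = dispersion_summary(mean, sd)
        print(f"{label}: sd/mean = {s['sd_over_mean']:.2f}, "
              f"implied minimum = {s['implied_minimum']:.1f} months")
```

Whether an implied minimum is plausible has to be judged against the NBER maturity criterion, which is why the formal tests treat the minimum duration explicitly.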

Our study of various subsamples is an attempt to control for possible
heterogeneity across cycles. We are interested in duration dependence
induced by economic behavior, and the chosen sample should
reflect intrinsic macroeconomic forces rather than special factors.
That is, the systematic mechanism of business cycles, which are prop- 

16 For example, with a hump-shaped hazard function, the slope of the fitted linear
regression line, $Y_i = \alpha + \beta i$, may be close to zero. Thus the Z and $Z(t_0 = \gamma)$ statistics,
which are based on this slope, may not be sensitive to such alternatives.

17 Besides power studies in the papers cited above by Shapiro, Wilk, Brain, and 
Stephens, there are also relevant results in Samanta and Schwarz (1988). 

18 It can be argued that the success of macroeconomics and macroeconomic policy
has been the halving of the mean duration of contractions in the postwar period. This
point is different from the one Baily (1978) made about diminished postwar amplitudes,
which was disputed by Romer (1989) but reaffirmed by Balke and Gordon
(1989).





TABLE 2

Business Cycle, Expansion, and Contraction Samples

                                          Sample    Mean        Standard
Sample                                    Size      Duration    Error

Expansions:
  E1. Entire sample                       31        34.6        21.8
  E2. Entire sample, excluding wars       26        28.9        15.3
  E3. Post-WWII                            9        48.6        28.9
  E4. Post-WWII, excluding wars            7        40.9        22.3
  E5. Pre-WWII                            21        26.5        10.7
  E6. Pre-WWII, excluding wars            19        24.5         9.2
Contractions:
  C1. Entire sample                       30        18.1        12.5
  C2. Post-WWII                            9        10.7         3.4
  C3. Pre-WWII                            21        21.2        13.6
Peak to peak:
  PP1. Entire sample                      30        52.8        24.9
  PP2. Entire sample, excluding wars      25        47.9        22.0
  PP3. Post-WWII                           9        59.2        31.0
  PP4. Post-WWII, excluding wars           7        51.6        26.0
  PP5. Pre-WWII                           20        47.9        20.3
  PP6. Pre-WWII, excluding wars           18        46.6        20.9
Trough to trough:
  TT1. Entire sample                      31        52.3        22.1
  TT2. Entire sample, excluding wars      26        47.4        17.8
  TT3. Post-WWII                           9        59.0        27.5
  TT4. Post-WWII, excluding wars           7        51.3        19.3
  TT5. Pre-WWII                           21        47.7        18.1
  TT6. Pre-WWII, excluding wars           19        45.9        17.6

erly considered a modern phenomenon of market economies, should 
be distinguished from accidental and episodic crises associated with 
wars, bad harvests, and foreign manipulation of oil prices. 19 Although 
one can always find circumstances specific to each cycle, to the extent 
that all business cycles are alike in their essentials, any intrinsic dura¬ 
tion dependence should be evident. In the absence of any clear infor¬ 
mation on the size or direction of the bias associated with large, 
episodic exogenous shocks, we have some preference for complete 
samples. 20 


19 See Burns and Mitchell (1946, chap. 1). For an evaluation of the role that such 
shocks have played in directing the path of U.S. economic fluctuations, see Blanchard 
and Watson (1986). 

20 Large, exogenous shocks may bias the evidence for weak economic periodicity in
either direction. For example, the coincidence of two oil price shocks in 1974 and 1979
or the existence of a quadrennial political business cycle may spuriously strengthen the
evidence. (See Britton [1986, chap. 6] for a discussion.)

TABLE 3

W and $W(t_0 = \gamma)$ Tests for Exponentiality
(p-Values under the Null of No Duration Dependence)

                                     Statistic

Sample                W(t_0 = 8)    W(t_0 = 9)    W(t_0 = 10)   W
Expansions:
  E1                  .360          .533          .699          .573
  E2                  .113          .194          .410          .211
  E3                  .512          .573          .633          .420
  E4                  .524          .592          .660          .310
  E5                  <.01          .019          .044          .015
  E6                  <.01          .018          .043          <.01

                      W(t_0 = 4)    W(t_0 = 5)    W(t_0 = 6)    W
Contractions:
  C1                  .725          .990          .672          .810
  C2                  .044          .150          .580          .188
  C3                  .436          .548          .859          .904

                      W(t_0 = 15)   W(t_0 = 16)   W(t_0 = 17)   W
Peak to peak:
  PP1                 <.01          .016          .037          .017
  PP2                 .015          .039          .085          .042
  PP3                 .351          .467          .581          .250
  PP4                 .509          .625          .741          .317
  PP5                 .010          .028          .057          .021
  PP6                 .040          .079          .151          .073

                      W(t_0 = 15)   W(t_0 = 16)   W(t_0 = 17)   W
Trough to trough:
  TT1                 <.01          <.01          <.01          .698
  TT2                 <.01          <.01          <.01          .748
  TT3                 .136          .182          .280          .735
  TT4                 .086          .117          .162          .523
  TT5                 <.01          <.01          .010          .778
  TT6                 <.01          <.01          .026          .971

Note. These finite-sample p-values are obtained by linearly interpolating the tables in Shapiro and Wilk (1972).
The samples are identified in table 2.


Probability values for the test statistics are given in tables 3 and 4.
These p-values represent the probability, under the null of no duration
dependence, of obtaining a test statistic at least as extreme as the one
actually obtained. 21 Small p-values therefore indicate significant departures from
exponentiality. We generally prefer the third column of each table,
that is, the $W(t_0 = \gamma)$ and $Z(t_0 = \gamma)$ tests, which assume a minimum


21 The tests employed require that the observations be independent. In fact the
correlations between successive durations in table 1 are quite low and are not
statistically significant at even the 20 percent level.

TABLE 4

Z, Z*, and $Z(t_0 = \gamma)$ Tests for Exponentiality
(p-Values under the Null of No Duration Dependence)

                                          Statistic

Sample                Z(t_0 = 8)    Z(t_0 = 9)    Z(t_0 = 10)   Z         Z*
Expansions:
  E1                  .383          .574          .818          .579      .077
  E2                  .165          .292          .491          .291      .028
  E3                  .705          .781          .862          .547      .320
  E4                  .701          .780          .866          .488      .382
  E5                  .021          .043          .090          .033      .008
  E6                  .022          .047          .099          .034      <.005

                      Z(t_0 = 4)    Z(t_0 = 5)    Z(t_0 = 6)    Z         Z*
Contractions:
  C1                  .587          .952          .453          .662      .052
  C2                  .096          .252          .672          .268      .351
  C3                  .393          .633          .956          .974      .067

                      Z(t_0 = 15)   Z(t_0 = 16)   Z(t_0 = 17)   Z         Z*
Peak to peak:
  PP1                 .011          .027          .067          .028      <.005
  PP2                 .018          .043          .103          .043      <.005
  PP3                 .535          .653          .792          .406      .357
  PP4                 .677          .8)4          .972          .487      .431
  PP5                 .018          .038          .080          .028      <.005
  PP6                 .045          .087          .169          .064      <.005

                      Z(t_0 = 15)   Z(t_0 = 16)   Z(t_0 = 17)   Z         Z*
Trough to trough:
  TT1                 <.005         <.005         .009          .964      .633
  TT2                 <.005         <.005         .006          .960      .416
  TT3                 .294          .371          .467          .921      .502
  TT4                 .210          .269          .346          .713      .709
  TT5                 <.005         .008          .018          .943      .162
  TT6                 .006          .014          .031          .838      .050

Note. The p-values are obtained using the asymptotic distributions of the Z and $Z(t_0 = \gamma)$ statistics, which are
N(0, 1), and of the Z* statistic, which is $\chi^2$ with two degrees of freedom. The samples are identified in table 2.


duration equal to the shortest observed duration (i.e., 17 months for
cycles, 10 for expansions, and 6 for contractions). The first two columns
in each table check the robustness of the results with smaller $t_0$
values, while the W, Z, and Z* columns do not incorporate information
regarding the likely range of $t_0$. We also generally prefer the W
statistics over the Z statistics since their exact finite-sample critical
values are available.

Consider first the W, Z, and Z* tests that do not condition on a
particular choice of $t_0$. When the expansion sample is taken as a
whole, the case for positive duration dependence appears very
slight. 22 Exclusion of wartime expansions leads to a reduced p-value,
but we still fail to reject the null of no duration dependence at conventional
significance levels. Significant duration dependence is indicated
for prewar expansions, especially when wars are excluded, while postwar
expansions show no evidence of duration dependence, regardless
of whether wars are excluded. There is no evidence for duration
dependence in any of the samples of contractions; however, in contrast
to expansions, there is more evidence for duration dependence
in the postwar period (though not significant at conventional levels)
than in the prewar period. 23 It is interesting to note that, while there is
generally little evidence of duration dependence in either expansions
or contractions, there is significant duration dependence over the
entire cycle, measured peak to peak. 24

The Z test results are in solid agreement with those of the W test.
The Z* test results also accord quite closely, leading us to suspect that 
most departures from the constant-hazard null hypothesis are mono¬ 
tone. 

We now report the results of the $W(t_0 = \gamma)$ and $Z(t_0 = \gamma)$ tests,
which make use of conditioning information on $t_0$. An upper bound
(and, in fact, a reasonable choice) for $t_0$ is the actual shortest observed
duration. Thus our preferred $t_0$ value is 6 months for contractions
and 10 months for expansions. For peak-to-peak cycles, the shortest
duration is 17 months, which is about the sum of the shortest contraction
and expansion lengths. For trough-to-trough cycles, the shortest
duration is 28 months; however, with no evidence of a distinction by
the NBER in designating the two types of cycles, 25 we prefer a $t_0$ of 17
months for each type of complete cycle. This conditioning information
has one important effect. The results for trough-to-trough cycles
now closely match those obtained for peak-to-peak cycles and imply
positive duration dependence in most samples.


22 The nature of the deviation from exponentiality, if any, can be inferred from the
sign of the Z statistics, which were negative for all significant or near-significant
departures from the null. The sign of the Z statistic is the same as the slope of the regression
of the normalized spacings on the order, which is the inverse of the regression of the
durations on the order. Thus negative Z statistics are associated with positive duration
dependence.

23 One interpretation of this result is that postwar countercyclical policy has been at
least partially successful in terms of increasing duration dependence in contractions;
i.e., contractions cluster around the smaller mean.

24 As will be seen shortly, there is also strong evidence of duration dependence in
trough-to-trough cycles, which the W, Z, and Z* tests fail to detect. This is due to the
minimum duration of 28 months for most of the trough-to-trough cycles, which is
implicitly used by the W, Z, and Z* tests as the minimum duration. The tests with lower,
more reasonable, minimum durations do detect duration dependence in
trough-to-trough cycles.

25 Recall the Burns and Mitchell statement of Sec. IV.

The results from the $Z(t_0 = \gamma)$ test are very similar to those obtained
with the $W(t_0 = \gamma)$ test. The differences between the first three columns
of tables 3 and 4 are very slight. Notably, the $Z(t_0 = 17)$ column
for whole cycles closely agrees with the $W(t_0 = 17)$ column.

We believe that our results, which for the most part suggest whole- 
cycle positive duration dependence and half-cycle duration indepen¬ 
dence, can be fruitfully reconciled. Whole-cycle duration dependence 
can take several different forms. Clearly, if both halves of the cycle 
exhibit duration dependence, so will the whole cycle. In addition, 
duration dependence of just expansions or of just contractions (with 
no duration dependence for the other half cycle) could generate cycli¬ 
cal duration dependence. Finally, if neither half cycle displays dura¬ 
tion dependence but their lengths are negatively correlated, the 
whole cycle may display duration dependence. If duration depen¬ 
dence and weak periodicity were an important and intrinsic feature of 
the business cycle, one would expect that one of the forms would 
predominate over the sample. Our results on half-cycle duration de¬ 
pendence indicate that this is not the case; instead, the significant 
whole-cycle duration dependence appears to be a mixture of all these 
possibilities. The slightly significant prewar expansion duration de¬ 
pendence and the almost significant postwar contraction duration 
dependence coupled with a slight negative correlation between half¬ 
cycle durations drive the whole-cycle results. 26 This clearly qualifies 
our whole-cycle results since it admits the possibility that they are a 
spurious coincidence of several factors. 
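As a rough consistency check on this reconciliation, the dispersion of whole cycles can be compared with what the half-cycle dispersions and their correlation imply through the variance of a sum. The sketch below uses the entire-sample standard errors from table 2 (E1 and C1) and the two correlations reported in the accompanying footnote (-.21 for an expansion with the following contraction, -.04 for a contraction with the following expansion). It treats the standard errors as standard deviations and ignores the slightly different sample sizes, so the numbers are only indicative; the function name is illustrative.

```python
import math


def whole_cycle_sd(sd_first, sd_second, corr):
    """Standard deviation of the sum of two half-cycle durations.

    Uses Var(A + B) = Var(A) + Var(B) + 2 * corr * sd(A) * sd(B).
    """
    variance = sd_first ** 2 + sd_second ** 2 + 2.0 * corr * sd_first * sd_second
    return math.sqrt(variance)


SD_EXPANSION = 21.8    # E1 in table 2
SD_CONTRACTION = 12.5  # C1 in table 2

# Trough to trough: an expansion followed by a contraction.
TT_IMPLIED = whole_cycle_sd(SD_EXPANSION, SD_CONTRACTION, -0.21)
# Peak to peak: a contraction followed by an expansion.
PP_IMPLIED = whole_cycle_sd(SD_CONTRACTION, SD_EXPANSION, -0.04)

print(f"implied TT standard deviation: {TT_IMPLIED:.1f} (table 2 reports 22.1)")
print(f"implied PP standard deviation: {PP_IMPLIED:.1f} (table 2 reports 24.9)")
```

The implied values are close to the whole-cycle standard errors in table 2, which is in line with the view that the small negative correlations contribute, modestly, to the tighter clustering of whole-cycle durations.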

VI. Conclusion 

Our examination of the complete samples of expansions and contrac¬ 
tions uncovered little evidence for duration dependence, which sug¬ 
gests that the maintained assumption of constant Markov transition 
probabilities in Hamilton (1989) is legitimate. In the postwar sample, 
our results indicate that this assumption appears to be particularly 
valid for expansions and perhaps less so for contractions, although 
the very small size of these samples may impair the power of the tests. 

In contrast to our results for expansions and contractions, we have 
found some indication of duration dependence in whole cycles, al¬ 
though these results must be qualified by the uncertain and varying 
nature of the duration dependence. However, if durations of cycles 
are indeed more tightly clustered than those associated with an exponential

26 Over the whole sample, the correlation between an expansion and the following
contraction is -.21, and for a contraction with the following expansion it is -.04.
Neither of these, however, is significant.

distribution, then this appears to provide evidence against
Fisher’s hypothesis of a “Monte Carlo” business cycle. The positive 
cyclical duration dependence suggests weakly periodic behavior and 
hence a business cycle that cannot be completely characterized only by 
examination of comovements among macroeconomic aggregates (as in 
Lucas [1977]). Stochastic weak periodicity, as manifested by positive 
duration dependence, may be an important feature of American 
business fluctuations, in addition to the obvious multivariate interac¬ 
tions. Here, however, more research is required to assess the eco¬ 
nomic significance of any duration dependence rather than just its 
statistical significance. 

By directly examining durations of NBER-designated expansions, 
contractions, and cycles, we beneficially avoided conditioning on a 
particular model. However, it will be of interest to ascertain the dura¬ 
tion dependence properties of various theoretical macroeconomic 
models. Models of recent vintage, whether of the new-classical, new- 
Keynesian, or real business cycle variety, are simple Frischian impulse 
propagation mechanisms. The nature of the fluctuations implied by 
such models therefore depends, of course, on the propagation struc¬ 
ture of the system and the nature of the impulses driving the system. 
It is a relatively straightforward exercise to explore the nature of 
duration dependence in the intertemporal equilibria implied by vari¬ 
ous economic models, given a filter for identifying turning points. 
There is, however, little agreement on the appropriate form of such a 
filter. The judgmental NBER filter, for example, does not have an 
exact, explicit representation. 

Similarly, it will also be of interest to ascertain the duration depen¬ 
dence properties of various statistical models commonly used as 
reduced-form descriptions of business cycle dynamics. In particular, 
although we have used the nonlinear Markov switching model to 
motivate the issues treated in this paper, questions of duration depen¬ 
dence arise naturally in many other contexts as well. Given a 
definition of turning points, for example, one would like to inquire 
about the nature of duration dependence associated with various 
linear and nonlinear, stationary and nonstationary, dynamic statistical 
models. This is especially interesting in the light of the fact that there 
is little agreement regarding an appropriate statistical model, whether 
linear or nonlinear. For example, among linear models, consensus 
has not yet been reached on the existence and importance of shock 
persistence associated with unit roots; the relative importance of the 
permanent and transient components in gross national product has 
been the subject of considerable debate (see, e.g., Campbell and Mankiw
1987; Cochrane 1988; Diebold and Rudebusch 1989a). The notions
of business cycle duration dependence introduced here may aid
in discrimination among such competing models, via their introduction
of a fresh metric for comparing economic models to data.

International Coordination of Fiscal
Policy in Limiting Economies 


V. V. Chari and Patrick J. Kehoe 

Federal Reserve Bank of Minneapolis and University of Minnesota 


We examine the limiting behavior of cooperative and noncoopera¬ 
tive fiscal policies as countries' market power goes to zero. We show 
that these policies converge if countries raise revenues through 
lump-sum taxation. However, if there are unremovable domestic 
distortions, such as distorting taxes, there can be gains to coordina¬ 
tion even when a single country’s policy cannot affect world prices. 
These results differ from the received wisdom in the optimal tariff 
literature. The key distinction is that, contrary to the tariff literature, 
the spending decisions of governments are explicitly modeled. 


I. Introduction 

Writing on international economic interdependence, Frenkel and Razin
(1985, p. 635) recently called for an analysis that “would determine
the optimal pattern of government spending . . . along the lines
of the optimal tariff literature.” This paper is a first step in that 
direction. We consider a world economy composed of a number of 
countries in which governments choose policy to maximize the utility 
of their respective consumers. Given multiple policymakers, we need 
first to take a stand on how they interact. We contrast two polar 
regimes: In one regime, policymakers act in a coordinated fashion, 
choosing policy cooperatively to maximize world welfare. In the other
regime, they choose policies noncooperatively to maximize their own
country’s welfare. As has long been recognized, the equilibria of these
regimes may be quite different. In particular, the literature on optimal
tariffs shows that substantial distortions and a reduction in
world welfare can result if governments cannot commit to cooperation.
In that literature, distortions arise from the monopoly power of
large countries. A standard result is that if countries become small
relative to the world economy, these distortions vanish and tariff
policies in the two regimes converge.

In this paper we ask whether or not an analogous result holds for 
fiscal policy: Do cooperative and noncooperative fiscal policies con¬ 
verge as countries become small? Fiscal policies are modeled as 
choices of spending levels on public goods and their means of finance. 
Unlike the literature on optimal tariffs, this paper explicitly models 
the spending decisions of governments, and this difference turns out 
to be crucial to the results. 

We begin by considering a model with lump-sum taxes. Expendi¬ 
tures on public goods affect world relative prices even though the 
revenues to finance them are raised through lump-sum taxes. As 
expected, the noncooperative equilibrium yields a lower level of wel¬ 
fare than the cooperative equilibrium. For this model we show that 
the analogue of the standard tariff result holds: as countries become 
small, the distortions vanish and policies in the two regimes converge. 

We then consider a model with distorting taxes. In this case the 
tariff result does not hold: the cooperative and noncooperative poli¬ 
cies are generally different, even in the limit. This suggests that if 
there are unremovable domestic distortions, countries can gain from 
international cooperation, even in markets in which they have no 
monopoly power. Since this result differs from the standard results 
reported in the tariff literature, it is important to understand the 
intuition behind it. 

In the limiting noncooperative equilibrium, each government seeks 
to achieve two conflicting goals by using one instrument, the tax rate. 
On the one hand, governments seek to equate the marginal rate of 
substitution between consumption of private goods to the given world 
price. On the other hand, they seek to balance the welfare gains of 
providing public goods against the welfare losses from distorting 
taxes. Of course, the only way to achieve the first goal is to set the tax 
rate to zero and provide no public goods. The optimal tax rate in 
the limiting noncooperative equilibrium appropriately balances the 
trade-offs in achieving these goals. In the limiting cooperative equilibrium,
governments recognize that because of tax distortions, the
world price does not signal the marginal rates of substitution between
private goods of other countries' consumers. Therefore, governments
do not seek to equate consumers’ marginal rates of substitution to the
world price; rather, they seek to equate consumers’ marginal rates of
substitution across countries.

Since our results with distorting taxes are at odds with received 
wisdom, it is natural to ask whether other sources of inefficiency lead 
to similar results. In an earlier version of the paper, we showed that 
for an overlapping generations economy with an inefficient competi¬ 
tive equilibrium, the noncooperative policies do not converge to the 
cooperative policies. 

This paper is related to several strands of literature. First, it is 
related to other analyses of fiscal policy in a world economy. In terms 
of strategic analyses of fiscal policy, we unify the results of Devereux 
(1986), Hamada (1986), Kehoe (1987), and Backus, Devereux, and 
Purvis (1988). However, we limit our attention to static models to 
avoid issues concerning the time inconsistency of tax/spending policy 
of the type considered by Lucas and Stokey (1983) and Persson and 
Svensson (1986). Once these simple models are well understood, it 
would be interesting to explore dynamic models of policy in which a 
key ingredient is the interaction between time inconsistency and coop¬ 
eration. Rogoff (1985) and Kehoe (1989) provide examples of this 
type of analysis. 

The paper is organized as follows: Section II describes the basic 
model and establishes that noncooperative equilibria typically do not 
coincide with cooperative equilibria and that cooperative equilibria 
are optimal in a sense that noncooperative equilibria are not. Section 
III proves that in this model, the two solutions converge as the econ¬ 
omy is replicated. Sections IV and V present economies in which 
these solutions diverge: in Section IV divergence occurs because the 
economy is not replicated, and in Section V it occurs because of a tax 
distortion. Section VI briefly summarizes our results and suggests 
how the analysis could be extended. 


II. Monopoly Distortions 

Consider a world economy composed of a finite number of countries. 
Equilibria of this economy are compared under two regimes; in the 
first, governments set policy cooperatively; in the second, govern¬ 
ments play a noncooperative game. Under both regimes, govern¬ 
ments optimally choose policy, taking as given that for each policy 
setting, private agents are in a competitive equilibrium. The noncooperative
and cooperative equilibria can be easily computed. We solve
first for the competitive equilibrium for an arbitrary setting of government
policy. The competitive equilibrium allocations and prices
are then used to express the governments’ objective functions in
terms of their policies. We then solve for the governments’ policies
under the two regimes.

A. Competitive Equilibria for Private Agents 

Consider a world economy composed of a finite number of countries,
$I$, with both private and public goods. Each country is populated by a
large number of identical consumers, say $L$, and a government. For
ease of notation, let $L$ equal one. Consumers in each country have
endowments of two private goods. The government of each country
has access to a production technology that transforms the first of
these private goods into a public good that benefits only residents of
the country. Each government pays for this public good by levying
lump-sum taxes on its inhabitants.

In particular, a consumer of country $i$ is endowed with a positive
amount $y_n^i$ of each (private) good $n$ and is taxed $\tau^i$ units of good 1, for
$i = 1, \ldots, I$ and $n = 1, 2$. This consumer chooses consumption levels
of the private goods, denoted by $c_n^i$ for $n = 1, 2$, and receives $g^i$ units
of the country $i$-specific public good. Consumer $i$’s preferences over
the consumption bundle $(c_1^i, c_2^i, g^i)$ are given by $u^i(c_1^i, c_2^i, g^i)$. We assume
that each $u^i$ is monotone, strictly concave, and twice continuously
differentiable and that the marginal utility of each good goes to
infinity as the amount of each good goes to zero. The consumer’s
budget constraint is

$c_1^i + p\,c_2^i = y_1^i - \tau^i + p\,y_2^i$, (1)

where $p$ denotes the price of good 2 relative to good 1. The consumer,
taking as given the price $p$ and the tax/spending policy $(\tau^i, g^i)$ of the
government of country $i$, chooses private good consumption $c_1^i$ and $c_2^i$
to maximize utility subject to (1). Let the demand functions for this
consumer be denoted by $c_n^i(\tau^i, p)$ for $n = 1, 2$, where the dependence
of these functions on the endowments is suppressed.

The government of country $i$ has access to a production technology
that converts private good 1 into a country $i$-specific public good. For
notational simplicity, we let this production function be linear with a
unit coefficient. The budget constraint for the government of country
$i$ is $g^i = \tau^i$. Since government spending always equals taxes, government
$i$’s policy is summarized by $\tau^i$ and is referred to as either spending
or taxes.

Market clearing in markets for goods 1 and 2 requires

$\sum_{i=1}^{I} c_1^i + \sum_{i=1}^{I} \tau^i = \sum_{i=1}^{I} y_1^i$ (2)

and

$\sum_{i=1}^{I} c_2^i = \sum_{i=1}^{I} y_2^i$. (3)

Let $\tau = (\tau^1, \ldots, \tau^I)$ and $c_n = (c_n^1, \ldots, c_n^I)$ for $n = 1, 2$.

A competitive equilibrium is an allocation of private consumption $(c_1, c_2)$,
a price $p$, and a vector of government tax/spending policies $\tau$ such
that the following conditions hold: (i) the consumption and government
spending vectors satisfy (2) and (3), and (ii) for each $i = 1, \ldots, I$,
the consumption allocations $c_1^i$ and $c_2^i$ maximize utility subject to (1),
given $\tau^i$ and $p$.

This equilibrium has three noteworthy features that we shall use
later. First, for any given vector $\tau$, the market-clearing conditions
together with the consumer demand functions implicitly define the
equilibrium price as a function of $\tau$, say $p = p(\tau)$. Second, given the
government’s budget constraint, we can express the maximized value
of consumer $i$’s utility as

$V^i(\tau^i, p(\tau)) = u^i[c_1^i(\tau^i, p(\tau)), c_2^i(\tau^i, p(\tau)), \tau^i]$. (4)

Third, the private consumption allocations and prices in the competitive
equilibrium with public goods are identical to those in an economy
with only private goods in which country $i$ consumers’ private
good endowments are $y_1^i - \tau^i$ and $y_2^i$, respectively, and $\tau^i$ enters the
utility function as a fixed parameter. Because of this feature, the
competitive equilibrium is clearly Pareto optimal in the class of allocations
$(c_1, c_2, \tau)$ that satisfy (2) and (3) and take $\tau$ as given.
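The first two features can be made concrete with a small numerical sketch. The example below assumes a particular utility function that is not in the paper, $u^i = \ln c_1^i + \ln c_2^i + \ln g^i$, chosen only because it satisfies the stated assumptions and yields closed-form demands; the endowments and tax levels are made up. Under this assumption each consumer spends half of after-tax wealth on each private good, and the market-clearing price is $p(\tau) = \sum_i (y_1^i - \tau^i) / \sum_i y_2^i$.

```python
import math


def price(tau, y1, y2):
    """Market-clearing price p(tau) of good 2 under the assumed log utility.

    Requires sum(tau) < sum(y1) so that the price is positive.
    """
    return (sum(y1) - sum(tau)) / sum(y2)


def demands(tau_i, p, y1_i, y2_i):
    """Demands (c1, c2) that maximize ln c1 + ln c2 subject to budget (1)."""
    wealth = y1_i - tau_i + p * y2_i
    return wealth / 2.0, wealth / (2.0 * p)


def indirect_utility(i, tau, y1, y2):
    """V^i(tau^i, p(tau)) as in (4), using g^i = tau^i."""
    p = price(tau, y1, y2)
    c1, c2 = demands(tau[i], p, y1[i], y2[i])
    return math.log(c1) + math.log(c2) + math.log(tau[i])


# Hypothetical two-country economy.
y1 = [4.0, 6.0]
y2 = [5.0, 5.0]
tau = [1.0, 2.0]

p = price(tau, y1, y2)
alloc = [demands(tau[i], p, y1[i], y2[i]) for i in range(len(tau))]

# Residuals of the market-clearing conditions (2) and (3).
excess_good1 = sum(c1 for c1, _ in alloc) + sum(tau) - sum(y1)
excess_good2 = sum(c2 for _, c2 in alloc) - sum(y2)

print("price:", p)
print("market-clearing residuals:", excess_good1, excess_good2)
print("V^1:", indirect_utility(0, tau, y1, y2))
```

Because the price depends on the whole vector $\tau$, a change in one country's spending moves every country's indirect utility, which is the channel the noncooperative analysis exploits.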


B. Noncooperative and Cooperative Equilibria

In Section IIA, government policies were arbitrary. In this subsection,
however, we consider policies that are outcomes of either a
noncooperative or a cooperative game among governments.

A noncooperative equilibrium is a vector of government policies $\tau$, a
competitive equilibrium price function $p(\tau)$, and vectors of competitive
equilibrium allocation functions $c_1(\tau, p(\tau))$ and $c_2(\tau, p(\tau))$ such
that (i) for each country $i$, $\tau^i$ maximizes (4) given $\tau^{-i} = (\tau^1, \ldots, \tau^{i-1},
\tau^{i+1}, \ldots, \tau^I)$ and (ii) for every $\tau$, the resulting prices and allocations
are a competitive equilibrium.

In a noncooperative equilibrium, each government chooses policy
separately to maximize its country’s objective function. In a cooperative
equilibrium, governments instead choose policy jointly to maximize
a world objective function. We assume that the world's objective
function is a weighted average of the individual countries’ objective
functions. For an arbitrary vector $\lambda$ of nonnegative weights, the world
objective function is $\sum_i \lambda^i V^i(\tau^i, p(\tau))$.

A cooperative equilibrium relative to $\lambda$ is a vector of government policies
$\tau$, a competitive equilibrium price function $p(\tau)$, and vectors of
competitive equilibrium allocation functions $c_1(\tau, p(\tau))$ and $c_2(\tau, p(\tau))$
such that (i) the vector $\tau$ maximizes the world objective function and
(ii) for every $\tau$, the resulting prices and allocations are a competitive
equilibrium.

Although we have just defined cooperative equilibria for arbitrary
weights, we are more interested in cooperative equilibria relative to
particular values of these weights. Such weights respect private ownership:
they set to zero an excess savings function associated with a
planning problem in which both private consumption and government
spending are chosen. We show that cooperative equilibria relative
to such weights solve a planning problem. To this end, consider
the following planning problem: For a given vector $\lambda = (\lambda^1, \ldots, \lambda^I)$
of nonnegative weights, let

$W(\lambda) = \max_{(c_1, c_2, \tau)} \sum_{i=1}^{I} \lambda^i u^i(c_1^i, c_2^i, \tau^i)$ (5)

subject to (2) and (3).

Let $p_1$ and $p_2$ denote the Lagrange multipliers on constraints (2) and
(3), respectively, and let $p = p_2/p_1$ be the normalized Lagrange multiplier.
Write the solution to this problem as $\{c_1(\lambda), c_2(\lambda), \tau(\lambda), p(\lambda)\}$,
and call it a (world) social optimum relative to $\lambda$. For each country $i$,
define the excess savings function $s^i(\lambda)$ to be

$s^i(\lambda) = [y_1^i - c_1^i(\lambda) - \tau^i(\lambda)] + p(\lambda)[y_2^i - c_2^i(\lambda)]$. (6)

Let $S$ denote the set of weights that yields excess savings of zero in
each country; that is, $S = \{\lambda \in \mathbb{R}_+^I \mid s^i(\lambda) = 0 \text{ for } i = 1, \ldots, I\}$. Call $S$
the set of weights that respect private ownership, and call a cooperative
equilibrium relative to some $\lambda$ in $S$ a cooperative equilibrium that
respects private ownership. We then have the following proposition.

Proposition 1. A cooperative equilibrium that respects private 
ownership is a social optimum. 

The proof of this proposition is given in the Appendix. Proposition
1 can be restated in a slightly more precise way. For any $\lambda$ in $S$, the set
of cooperative equilibria relative to $\lambda$ coincides with the set of social
optima relative to the same $\lambda$. The intuition behind the proposition
runs something like this: The cooperative maximization problem is a
search across policies (and therefore across competitive equilibria,
given those policies) for the one that yields the highest value of the
objective function. We know that the private consumption allocations
of these competitive equilibria are optimal, given the government
policy. Only one circumstance, then, could render the cooperative
equilibria suboptimal; that is, government policy is not chosen optimally.

To understand how government spending could be chosen suboptimally,
consider a cooperative equilibrium for an arbitrary vector of
weights $\lambda$. Recall that in our cooperative equilibrium, the only choice
that governments make is the level of government spending. Sup¬ 
pose, instead, that we consider a cooperative equilibrium in which 
governments not only choose spending but also make lump-sum 
transfers between residents of each country. In that equilibrium, for 
any vector of weights, the governments will set spending optimally 
and then use a separate set of instruments—the lump-sum trans¬ 
fers—to achieve the optimal income distribution across countries. In 
contrast, in our cooperative equilibrium these two goals must be 
achieved by a single set of instruments: the levels of government 
spending. If the weights chosen do not respect the initial distribution 
of income, the government spending decisions are distorted. Basi¬ 
cally, countries assigned higher (or lower) weights than their endow¬ 
ments justify are compensated in utility terms by inefficiently high (or 
low) levels of government spending. In the proof of proposition 1, we 
establish that the set of weights that respects this initial distribution of 
endowments is nonempty and that the amount of government spend¬ 
ing for a cooperative equilibrium relative to such weights is optimal. 

We next show that with a fixed number of countries, the cooperative
equilibria typically do not coincide with the noncooperative ones.
To demonstrate this point, we compare the first-order conditions of
the noncooperative equilibria with those of the cooperative equilibria.
In a noncooperative equilibrium, the government of country k
chooses spending $\tau^k$ to satisfy

$$\frac{\partial V^k}{\partial \tau^k} + \frac{\partial V^k}{\partial p}\,\frac{\partial p}{\partial \tau^k} = 0. \tag{7}$$

Using the envelope theorem, we can easily transform this condition
into

$$\left(-1 + \frac{u_3^k}{u_1^k}\right) + (y_2^k - c_2^k)\,\frac{\partial p}{\partial \tau^k} = 0. \tag{8}$$

We call the first term in equations (7) and (8) the direct effect of a
change in policy and the second term the indirect (or general equilibrium)
effect. The direct effect measures the impact of a change in
policy by a government on that country's residents at a given world
price p. Note, however, that with a finite number of countries, a
change in spending by one government also affects this world price.
The indirect effect measures the impact on residents of a change in
the world price resulting from a change in government spending.

Now a cooperative equilibrium that respects private ownership is a
social optimum. Therefore, the marginal rate of substitution between
private and public consumption must be equated to the marginal rate
of transformation. Hence, for each country k, government spending
$\tau^k$ must satisfy $-1 + (u_3^k/u_1^k) = 0$. The wedge between these two
first-order conditions is the term

$$(y_2^k - c_2^k)\,\frac{\partial p}{\partial \tau^k}, \tag{9}$$

which we call the monopoly distortion.

In the noncooperative allocation the monopoly distortion drives a
wedge between the socially optimal decision and the noncooperative
decisions. Basically, in the noncooperative allocation, each government
takes into account its effect on world prices and chooses a policy
to influence prices in a direction that benefits its residents. In particular,
suppose that at the cooperative level of spending, country k is a
net exporter of good 1. At this allocation a noncooperative government
of country k would have an incentive to raise its spending a little.
Doing so decreases the net private supply of private good 1 and raises
the relative price of exports. In the process, country k makes itself
better off. Likewise, if at the cooperative level of spending country k is
a net importer of good 1, then a noncooperative government of this
country would have an incentive to lower its spending a little. Doing
so increases the net private supply of private good 1 and lowers the
relative price of imports.

In general, then, when there is a finite number of countries, the
noncooperative and cooperative equilibria do not coincide because of
monopoly distortions. Indeed, the only type of cooperative equilibrium
that could also be a noncooperative equilibrium is one without
trade. In this special case, monopoly distortions disappear and governments
have no incentives to distort spending decisions to affect
world prices.

III. Convergence in Replica Economies 

In this section we show that if the economy of Section II is replicated,
then the monopoly distortions go to zero and the noncooperative and
cooperative allocations converge. Consider replicating the economy
of Section II a fixed number of times, say J. (Eventually, we let J go to
infinity.) The Jth replica economy has countries indexed by ij for
$i = 1, \ldots, I$ and $j = 1, \ldots, J$, where i refers to the type of country and j
refers to the replication number. All J consumers of type i have the
same utility functions and endowments; that is, for all $j = 1, \ldots, J$, let
$u^{ij} = u^{i1}$ and $y^{ij} = y^{i1}$. The demand function of consumer ij for good n
is denoted by $c_n^{ij}(\tau^{ij}, p)$ for n = 1, 2. Market clearing for good 1 then
requires


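$$\sum_{j=1}^{J}\sum_{i=1}^{I} c_1^{ij}(\tau^{ij}, p) + \sum_{j=1}^{J}\sum_{i=1}^{I} \tau^{ij} = \sum_{j=1}^{J}\sum_{i=1}^{I} y_1^{ij}. \tag{10}$$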

Market clearing for good 2 is similarly defined. These conditions
implicitly define the equilibrium price as a function of government
spending. We write this function as $p = p(T^J)$, where
$T^J = (\tau^{11}, \ldots, \tau^{I1}; \ldots; \tau^{1J}, \ldots, \tau^{IJ})$. The objective function of the
government of country ij is $V^{ij}(\tau^{ij}, p(T^J))$, where $V^{ij}$ is defined
analogously to (4).

For the replica economy, noncooperative and cooperative equilibria
are defined as in Section II. We focus on equilibria that are symmetric,
in the sense that all countries of the same type choose the same
policy; that is, $\tau^{ij} = \tau^{i1}$ for all i and j. From now on, this symmetry
requirement is understood. We then have the following proposition.

Proposition 2. As the number of replications goes to infinity, the
noncooperative equilibria converge to cooperative equilibria that
respect private ownership.

The proof of this proposition is a straightforward application of the
definition of a replica economy, together with a little price theory. For
any given number of replications, the noncooperative solution clearly
coincides with the cooperative solution if and only if the monopoly
distortions are zero. In the Jth replica economy, the monopoly distortion
for country k1 (the first replica of type k) is

$$(y_2^{k1} - c_2^{k1})\,\frac{\partial p(T^J)}{\partial \tau^{k1}}. \tag{11}$$

The proposition is proved by showing that this distortion goes to zero
as J goes to infinity for each type k country.

First, consider an economy with J equal to one, that is, the original
economy. In this economy, the market-clearing condition for good 1
is

$$\sum_{i=1}^{I} c_1^{i1}(\tau^{i1}, p) + \sum_{i=1}^{I} \tau^{i1} = \sum_{i=1}^{I} y_1^{i1}. \tag{12}$$


The market-clearing conditions for the private goods define the equilibrium
price function $p(T^1)$ and the private consumption allocations
$\{c^{i1} \mid i = 1, \ldots, I\}$. To evaluate how a spending change by the government
of a type k country affects the equilibrium price, differentiate
(12) to obtain

$$\frac{\partial p(T^1)}{\partial \tau^{k1}} = -\,\frac{1 + (\partial c_1^{k1}/\partial \tau^{k1})}{\sum_{i=1}^{I} \partial c_1^{i1}/\partial p}. \tag{13}$$


Now consider an economy with J greater than one. In such an economy,
the market-clearing conditions (10) for good 1 define the equilibrium
price function $p(T^J)$ and the private consumption allocations
$\{c^{ij} \mid i = 1, \ldots, I;\ j = 1, \ldots, J\}$. To evaluate how a spending change by
a government of a type k country, say country k1, affects this price,
differentiate (10) to obtain

$$\frac{\partial p(T^J)}{\partial \tau^{k1}} = -\,\frac{1 + (\partial c_1^{k1}/\partial \tau^{k1})}{\sum_{j=1}^{J}\sum_{i=1}^{I} \partial c_1^{ij}/\partial p}. \tag{14}$$


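From the definition of a replica economy, $c_1^{ij}(p, \tau^{ij}) = c_1^{i1}(p, \tau^{i1})$ for
all i and j, and by our symmetry assumption, $\tau^{ij} = \tau^{i1}$ for all i and j.
Thus in an equilibrium of the Jth replica economy, we can write (10)
as

$$J\sum_{i=1}^{I} c_1^{i1}(\tau^{i1}, p) + J\sum_{i=1}^{I} \tau^{i1} = J\sum_{i=1}^{I} y_1^{i1}, \tag{15}$$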


which is equivalent to (12). That is, the competitive equilibria of the
Jth replica economy are simply the competitive equilibria of the original
economy replicated J times. In particular, with concave utility
functions, all consumers of the same type get the same allocation.
This fact about equilibrium allocations implies

$$\sum_{j=1}^{J}\sum_{i=1}^{I} \frac{\partial c_1^{ij}}{\partial p} = J\sum_{i=1}^{I} \frac{\partial c_1^{i1}}{\partial p}. \tag{16}$$


Combining (13), (14), and (16) gives

$$\frac{\partial p(T^J)}{\partial \tau^{k1}} = \frac{1}{J}\,\frac{\partial p(T^1)}{\partial \tau^{k1}}. \tag{17}$$
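The chain from (13), (14), and (16) to this scaling result can be sketched in one display. Here N is our shorthand (not the authors' notation) for the numerator shared by (13) and (14); it depends only on country k1's own consumption response, which under the symmetry assumption is the same in the original and the replicated economy.

```latex
\begin{align*}
\frac{\partial p(T^J)}{\partial \tau^{k1}}
  &= \frac{N}{\sum_{j=1}^{J}\sum_{i=1}^{I} \partial c_1^{ij}/\partial p}
   = \frac{N}{J \sum_{i=1}^{I} \partial c_1^{i1}/\partial p}
   = \frac{1}{J}\,\frac{\partial p(T^1)}{\partial \tau^{k1}}, \\
(y_2^{k1} - c_2^{k1})\,\frac{\partial p(T^J)}{\partial \tau^{k1}}
  &= \frac{1}{J}\,(y_2^{k1} - c_2^{k1})\,\frac{\partial p(T^1)}{\partial \tau^{k1}}
   \;\longrightarrow\; 0 \quad \text{as } J \to \infty .
\end{align*}
```

The second line uses the fact that the net trade position of a type k country is unchanged by replication, so only the price response shrinks.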

Using (17) and the fact that the equilibria in the replica economy are 
the replicated equilibria of the original economy, we see that as J goes 
to infinity, the monopoly distortion (11) goes to zero for each country. 
The noncooperative equilibria thus converge to the cooperative equilibria.

IV. Divergence in Nonreplica Economies

In Section III, replication was shown to cause the cooperative and
noncooperative equilibria to converge. The process of replication implies
that countries become small in two ways. First, each country's
endowment, as a fraction of the world endowment, converges to zero.
Second, each country's socially optimal level of government spending,
as a fraction of the world endowment, converges to zero. In this
section we present a parametric example of a nonreplica economy in
which these conditions fail and the two solutions diverge. (This example
is closely related to Devereux [1986].)

Let there be I countries (indexed $i = 1, \ldots, I$) and I private goods
(indexed $n = 1, \ldots, I$). Consumers in country i own the world
endowment of good i but own no other goods. Only the government
of country i has access to a production technology that converts private
good i into a country-specific good at a one-to-one rate. In addition,
let $c_n^i$ denote the consumption of private good n by consumers in
country i, let $y_i^i$ denote the country i consumer endowment of good i,
and let $\tau_i^i$ denote the amount of private good i that is converted by the
government of country i into a public good. For each i, let $y_i^i = y_1^1$ and
let the utility functions be given by

$$u^i(c_1^i, \ldots, c_I^i, \tau_i^i) = \sum_{n=1}^{I} \frac{\ln c_n^i}{I} + \ln \tau_i^i. \tag{18}$$

Let $p = (p_1, \ldots, p_I)$ denote the prices of the private goods. Consumers
in country i solve the problem

$$V^i(\tau_i^i, p) = \max_{\{c^i\}} \sum_{n=1}^{I} \frac{\ln c_n^i}{I} + \ln \tau_i^i$$

subject to

$$\sum_{n=1}^{I} p_n c_n^i = p_i\,(y_i^i - \tau_i^i), \tag{19}$$

where $c^i = (c_1^i, \ldots, c_I^i)$ and the consumer's and the government's
budget constraints are already combined. The resulting demand
functions are $c_n^i = p_i (y_i^i - \tau_i^i)/(I p_n)$. Market clearing requires

$$\sum_{i=1}^{I} c_n^i + \tau_n^n = y_n^n \quad \text{for } n = 1, \ldots, I. \tag{20}$$

Substituting the demand functions into the market-clearing conditions
gives the equilibrium price functions $p_n(T) = (y_1^1 - \tau_1^1)/(y_n^n - \tau_n^n)$,
where we normalized prices by setting $p_1 = 1$. Given the price and
demand functions, the first-order conditions for the noncooperative
equilibrium can be rewritten as

$$\tau_i^i = I\,(y_i^i - \tau_i^i) \quad \text{for } i = 1, \ldots, I. \tag{21}$$

Given our symmetry assumption, (21) implies that the noncooperative
level of government spending is $\tau_i^i = I y_1^1/(I + 1)$.

Now consider the cooperative solution. Given the symmetry of the
example, any vector of weights that places an equal weight on each
country will respect private ownership. The first-order conditions for
this problem can be rewritten as

$$\tau_i^i = y_i^i - \tau_i^i \quad \text{for } i = 1, \ldots, I. \tag{22}$$

Imposing symmetry, we see that the cooperative level of government
spending is given by $\tau_i^i = y_1^1/2$. Thus as the number of countries goes
to infinity, the cooperative and noncooperative solutions diverge.
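A minimal numerical sketch of this divergence, using only the closed forms implied by (21) and (22). The function names and the endowment value 1.0 are hypothetical choices for illustration.

```python
def noncooperative_spending(y, n_countries):
    """Solve (21), tau = I * (y - tau), for the symmetric spending level."""
    return n_countries * y / (n_countries + 1)


def cooperative_spending(y):
    """Solve (22), tau = y - tau, for the symmetric spending level."""
    return y / 2


if __name__ == "__main__":
    y = 1.0  # hypothetical common endowment of each country's own good
    for n in (1, 2, 10, 100, 1000):
        tau_n = noncooperative_spending(y, n)
        gap = tau_n - cooperative_spending(y)
        print(n, round(tau_n, 4), round(gap, 4))
```

With a single country the two levels coincide, which matches the no-trade case; as the number of countries grows, noncooperative spending approaches the whole endowment while cooperative spending stays at half of it.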

Although in this example the number of countries goes to infinity,
each type of country maintains monopoly power over a good. A given
country i has two sources of monopoly power over private good i.
First, it has monopoly power in endowments: it is the only country
with endowments of good i. Second, it has monopoly power in production:
it is the only country that can convert private good i into a
public good. Neither of these two sources of monopoly power goes to
zero as new types of countries are added. It is possible to construct
examples in which either source alone causes the two solutions to
diverge, but the algebra is somewhat tedious.


V. Divergence in an Economy with Tax 
Distortions 

In this section we describe an economy with distortionary taxes and
show that the two solutions do not necessarily converge even though
monopoly distortions go to zero. Consider an economy identical to
the one in Section II except that taxes are distortionary instead of
lump-sum. For simplicity, let all countries be identical. Since there is
only one type of country, think of an economy with J such countries as
the Jth replica of an original economy with one country. Let the
distortionary tax be a linear tax on the consumption of good 1. A
representative consumer in country j (indexed $j = 1, \ldots, J$) solves the
problem

$$\max_{\{c_1^j,\, c_2^j\}} u^j(c_1^j, c_2^j, g^j) \tag{23}$$

subject to

$$(1 + \tau^j)\,c_1^j + p\,c_2^j = y_1^j + p\,y_2^j,$$

where $\tau^j$ is the consumption tax imposed by the government on its
residents' consumption of good 1. This problem yields demand functions
$c_n^j(\tau^j, p)$ for n = 1, 2. The country j government chooses taxes
and government spending $g^j$ to satisfy its budget constraint:
$g^j = \tau^j c_1^j$. A competitive equilibrium is defined as in Section II. The
market-clearing conditions implicitly define the equilibrium price as
a function of the tax policies, say $p = p(T)$.

To define the government's objective function, first substitute the
consumer's demand functions and the equilibrium value of the government's
budget constraint into the consumer's utility function to
obtain

$$V^j(\tau^j, p(T)) = u^j\big[c_1^j(\tau^j, p(T)),\ c_2^j(\tau^j, p(T)),\ \tau^j c_1^j(\tau^j, p(T))\big]. \tag{24}$$


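The first-order conditions for the noncooperative level of taxes are

$$\frac{\partial V^k}{\partial \tau^k} + \frac{\partial V^k}{\partial p}\,\frac{\partial p}{\partial \tau^k} = 0 \quad \text{for } k = 1, \ldots, J. \tag{25}$$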


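Again, the first-order conditions are the sum of direct and indirect
effects. The direct effects can be written as

$$\frac{\partial V^k}{\partial \tau^k} = u_1^k c_1^k\left[-\frac{1}{1+\tau^k} + \frac{u_3^k}{u_1^k}\left(1 + \frac{\tau^k}{c_1^k}\,\frac{\partial c_1^k}{\partial \tau^k}\right)\right] \tag{26}$$

and the indirect effects as

$$\frac{\partial V^k}{\partial p}\,\frac{\partial p}{\partial \tau^k} = \frac{u_1^k}{1+\tau^k}\,(y_2^k - c_2^k)\,\frac{\partial p}{\partial \tau^k} + u_3^k\,\tau^k\,\frac{\partial c_1^k}{\partial p}\,\frac{\partial p}{\partial \tau^k}. \tag{27}$$

From the market-clearing conditions, we have

$$\frac{\partial p}{\partial \tau^k} = \frac{-\big[c_1^k + (1+\tau^k)(\partial c_1^k/\partial \tau^k)\big]}{\sum_{j=1}^{J} (1+\tau^j)(\partial c_1^j/\partial p)}. \tag{28}$$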


Recall that the direct effects measure how a change in government
policy affects that country's residents at a given world price, whereas
indirect effects measure how a policy change affects residents by affecting
the world price. With distortionary taxes, both effects are
changed. The direct effects no longer imply that the marginal rate of
substitution should be equated to the marginal rate of transformation.
Rather, these terms are modified by the elasticity of consumption
with respect to the distortionary tax. The indirect effects are now
composed of two terms. The first term in (27) is analogous to the
indirect effect in (8); both represent monopoly distortions. The second
term in (27), called the tax distortion effect, measures how much the
price changes that result from a tax change affect utility by changing
the level of public goods provided. Note that if consumption of good
1 were completely inelastic with respect to its price, this second distortion
would be zero, leaving only the monopoly distortion.

Compare this solution with the cooperative solution. Given the symmetry
of the example, equal weights respect private ownership. We
consider a symmetric solution in which all policies are the same. The
first-order conditions for the cooperative allocation are

$$\frac{\partial V^k}{\partial \tau^k} + \sum_{j=1}^{J} \frac{\partial V^j}{\partial p}\,\frac{\partial p}{\partial \tau^k} = 0. \tag{29}$$


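In contrast to the model in Section II, the extra distortion that results
from taxes causes the indirect effects not to cancel. Indeed, the sum
of indirect effects is

$$\sum_{j=1}^{J} \frac{\partial V^j}{\partial p}\,\frac{\partial p}{\partial \tau^k} = \sum_{j=1}^{J} u_3^j\,\tau^j\,\frac{\partial c_1^j}{\partial p}\,\frac{\partial p}{\partial \tau^k}. \tag{30}$$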


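This sum of the tax-induced distortions causes the two solutions
to diverge. To see this, let $p(T^J)$ represent the equilibrium price
function with J identical countries. As in proposition 2,
$\partial p(T^J)/\partial \tau^k = J^{-1}\,\partial p(T^1)/\partial \tau^k$. Using symmetry, we see that the
noncooperative solution is given by

$$-\frac{1}{1+\tau} + \frac{u_3}{u_1}\left(1 + \frac{\tau}{c_1}\,\frac{\partial c_1}{\partial \tau}\right) + \frac{1}{J}\left(\frac{u_3}{u_1}\,\frac{\tau}{c_1}\,\frac{\partial c_1}{\partial p}\right)\frac{\partial p(T^1)}{\partial \tau} = 0, \tag{31}$$

and the cooperative solution is given by

$$-\frac{1}{1+\tau} + \frac{u_3}{u_1}\left(1 + \frac{\tau}{c_1}\,\frac{\partial c_1}{\partial \tau}\right) + \left(\frac{u_3}{u_1}\,\frac{\tau}{c_1}\,\frac{\partial c_1}{\partial p}\right)\frac{\partial p(T^1)}{\partial \tau} = 0. \tag{32}$$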


The wedge between these solutions is

$$\frac{J-1}{J}\left(\frac{u_3}{u_1}\,\frac{\tau}{c_1}\,\frac{\partial c_1}{\partial p}\right)\frac{\partial p(T^1)}{\partial \tau}. \tag{33}$$


We can use (28) to show that, in general, this wedge is nonzero and 
thus these two solutions diverge as the number of countries J goes to 
infinity. It is worth pointing out that in the special case of Cobb- 
Douglas utility, the relevant income and substitution effects cancel 
and this wedge is zero. 

The intuition for this result is as follows: Substituting (28) into (29)
and using (30) gives $u_3/u_1 = 1$; that is, in a cooperative equilibrium,
the marginal rate of substitution between private and government
consumption is equated to the marginal rate of transformation. Thus
the cooperative equilibrium with distortionary taxes has the same
allocations as the cooperative equilibrium with lump-sum taxes. The
equivalence follows from our symmetry assumptions.
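The substitution can be written out as a short sketch. It assumes the symmetric case, so country superscripts are dropped, and it restates the needed pieces: the direct effect divided by $u_1 c_1$, the tax distortion term summed over countries, and the one-country price response from the market-clearing condition. The grouping of terms is ours.

```latex
\begin{align*}
% cooperative first-order condition in the symmetric case, divided by u_1 c_1
0 &= -\frac{1}{1+\tau}
     + \frac{u_3}{u_1}\left(1 + \frac{\tau}{c_1}\frac{\partial c_1}{\partial \tau}\right)
     + \frac{u_3}{u_1}\,\frac{\tau}{c_1}\,\frac{\partial c_1}{\partial p}\,
       \frac{\partial p(T^1)}{\partial \tau}, \\
% price response with one representative country
\frac{\partial p(T^1)}{\partial \tau}
  &= -\frac{c_1 + (1+\tau)\,\partial c_1/\partial \tau}{(1+\tau)\,\partial c_1/\partial p}, \\
% substituting the price response into the last term
\frac{u_3}{u_1}\,\frac{\tau}{c_1}\,\frac{\partial c_1}{\partial p}\,
\frac{\partial p(T^1)}{\partial \tau}
  &= -\frac{u_3}{u_1}\left(\frac{\tau}{1+\tau}
     + \frac{\tau}{c_1}\frac{\partial c_1}{\partial \tau}\right), \\
% the elasticity terms cancel
0 &= -\frac{1}{1+\tau} + \frac{u_3}{u_1}\,\frac{1}{1+\tau}
  \quad\Longrightarrow\quad \frac{u_3}{u_1} = 1 .
\end{align*}
```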

The noncooperative equilibrium allocations generally differ from
the cooperative ones. To see the difference, consider the limiting
noncooperative equilibrium. Since each government chooses its tax
rate taking the world price as given, the last term on the left side of
(31) is zero. This implies that in the limiting noncooperative equilibrium,
the marginal rates of substitution between private and government
consumption are not equated to the marginal rate of transformation.
Governments choose not to equate these marginal rates
because they have one instrument (the tax rate) and two conflicting
goals. On the one hand, governments seek to equate the marginal rate
of substitution between the private goods to the world price; on the
other hand, they seek to set $u_3/u_1$ equal to one. The first goal can be
met only by setting the tax rate equal to zero; achieving the second
means distorting the marginal rates of substitution between private
goods away from the world price. The optimal policy in the noncooperative
equilibrium appropriately balances these two goals, achieving
neither completely. In contrast, in the limiting cooperative equilibrium,
governments recognize that because of tax distortions, the
world price does not signal the marginal rates of substitution between
the private goods of other countries' consumers. Thus governments
do not seek to equate the marginal rates of substitution to the world
price; rather, they seek to equate consumers' marginal rates of substitution
across countries. By appropriately adjusting the tax rates in
all the countries simultaneously, they can achieve these two goals.

There is an alternative way to see why the cooperative and noncooperative
allocations differ. Suppose that all governments are initially
at the cooperative equilibrium. We show that, even in the limit, a
single government can deviate from this equilibrium and make itself
better off. Suppose, for now, that there exists a feasible policy change
by a government that increases $c_1 + g$ by a small amount. With a
Taylor series expansion, the change in utility for that country is given
by

$$\Delta u = u_1\,dc_1 + u_2\,dc_2 + u_3\,dg. \tag{34}$$

To evaluate this expression, note that since we started at the cooperative
equilibrium, $u_3 = u_1$, while from the consumer's and the government's
budget constraints we have

$$dc_1 + dg + p\,dc_2 = 0, \tag{35}$$

and from the consumer's first-order condition we have $u_2/u_1 = p/(1 + \tau)$.
Substituting (35) into (34) and simplifying gives

$$\Delta u = u_1\,(dc_1 + dg)\left[1 - \frac{1}{1+\tau}\right]. \tag{36}$$

Since taxes are positive in the cooperative equilibrium, (36) implies
that if there is a policy change that increases $c_1 + g$, then such a
change increases utility. It is easy to show that such a policy change
exists if the price elasticity of demand for good 1 is different from
one, that is, if preferences are not Cobb-Douglas. With Cobb-Douglas
preferences, however, the demand function for good 1 is
$c_1 = a(y_1 + p y_2)/(1 + \tau)$, where a is the share of private expenditure on good 1.
Combining this demand function with the fact that $c_1 + g = (1 + \tau)c_1$,
we see that no change in tax rates can change $c_1 + g$. With any other
preferences the cooperative equilibrium will not be a noncooperative
equilibrium.
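The Cobb-Douglas exception can be checked numerically with a small sketch. It assumes, for illustration only, that preferences over the two private goods are CES with elasticity of substitution `sigma` and separable from public spending, so that demand for good 1 takes the standard CES form; `sigma = 1` reproduces the Cobb-Douglas demand quoted above. All parameter values and function names are hypothetical.

```python
def demand_good1(tau, p, y1, y2, a, sigma):
    """CES demand for good 1 at consumer price (1 + tau); sigma = 1 is Cobb-Douglas."""
    income = y1 + p * y2          # endowment income at world price p
    q = 1.0 + tau                 # tax-inclusive price of good 1
    if abs(sigma - 1.0) < 1e-12:
        return a * income / q     # Cobb-Douglas: constant expenditure share a
    num = a ** sigma * q ** (-sigma)
    den = a ** sigma * q ** (1 - sigma) + (1 - a) ** sigma * p ** (1 - sigma)
    return num * income / den


def private_plus_public(tau, p=1.0, y1=1.0, y2=1.0, a=0.5, sigma=1.0):
    """Return c1 + g, where g = tau * c1 is tax revenue spent on the public good."""
    c1 = demand_good1(tau, p, y1, y2, a, sigma)
    return (1.0 + tau) * c1


if __name__ == "__main__":
    for sigma in (1.0, 2.0):
        values = [private_plus_public(t, sigma=sigma) for t in (0.1, 0.2, 0.3)]
        print(sigma, [round(v, 4) for v in values])
```

At a fixed world price, `private_plus_public` does not move with the tax rate when `sigma = 1`, while it does move for other elasticities, which is the sense in which a single government can find a profitable deviation outside the Cobb-Douglas case.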

It is worth noting that all results in this section hold even if the
instrument available to governments is a tariff rather than a consumption
tax. We consider a consumption tax rather than a tariff for
two reasons: First, for notational convenience, we want to examine a
model with identical countries; obviously, a tariff cannot raise revenues
if there is no trade. Second, with identical countries, there is no
monopoly distortion effect; consequently, the only source of distortion
lies in the way taxes distort private decisions. Since we wanted to
focus on this issue, we considered a consumption tax.

Consider the connection between our results and those in the tariff
literature. There are two distinctions between our model and those in
the tariff literature. First, the tariff literature assumes that governments
can levy lump-sum taxes (and transfers) as well as distorting
taxes. Second, in that literature, government spending is exogenous.
In the limit, governments have no monopoly and thus cannot alter
world prices in their favor. Therefore, in the limit, governments will
not use distorting taxes to finance spending if lump-sum taxes
are available. To make the comparison interesting, suppose that governments
can make only nonnegative lump-sum transfers but government
spending is still exogenously fixed. Clearly, in the limit no
government will levy a tax above that needed to finance government
spending, so the noncooperative and cooperative equilibria coincide.
When government spending is endogenous, however, a tax distortion
effect similar to the one just analyzed will cause the two equilibria to
differ.

It is worth pointing out that these results are also related to a
literature in mathematical economics that characterizes Walrasian
equilibria as the limit of noncooperative equilibria (see, e.g., the symposium
in vol. 22 of the Journal of Economic Theory [April 1980]). To
clarify this relationship, consider the following two-stage manipulation
game in an exchange economy inhabited only by private agents.
In stage 1, the agents decide how much of their endowments to destroy.
In stage 2, given their remaining endowments, they participate
as price takers in a competitive equilibrium. This manipulation game
is closely related to the games we study here. Indeed, there may be a
way to adapt the results in this literature to prove a more general
version of some of our results.

VI. Summary and Conclusions 

In this extension of the analysis of tariff policy to models of fiscal
policy, we have made two major points: First, if lump-sum taxes are
available, then the basic results on tariff policy carry over to fiscal
policy; as each country becomes small in the world economy, the
noncooperative allocations converge to the cooperative allocations. Second,
if revenues must be raised through distorting taxes, then these
solutions generally do not converge.

We have made these points in simple models, but the intuition
behind them is broader. In the limiting noncooperative equilibrium,
each government uses a distorting tax to attempt to achieve two conflicting
goals. Each government seeks to provide an optimal level of
government spending and, at the same time, to equate the marginal
rates of substitution of its consumers to the world price. Since other
countries must also use distorting taxes, the world price does not,
however, reflect the marginal rates of substitution of consumers in
other countries. Thus there is a loss of efficiency relative to the cooperative
equilibrium. Similar results may hold for other types of distortions,
such as incomplete markets.

Throughout the paper we restricted our analysis to static models to
avoid problems associated with the time inconsistency of optimal policy.
Rogoff (1985) and Kehoe (1989) have shown in dynamic settings
that cooperative equilibria may be Pareto-dominated by noncooperative
equilibria. An essential ingredient for this nonoptimality result is
that policy in the cooperative equilibrium must be time inconsistent.
In contrast, we attempt to isolate and understand factors that cause
noncooperative equilibria to diverge from cooperative equilibria. Our
main finding is that such a divergence result can hold in settings with
distorting taxes. In particular, we show that the divergence result can
hold even in a static model. Of course, in dynamic models with distortions,
both of these results can hold simultaneously. Integrating these
literatures would enable us to identify the benefits and costs of cooperation
in policymaking.

Appendix

Proof of Proposition 1

To prove proposition 1, our basic line of argument is as follows: First, we
consider a cooperative equilibrium in which governments are allowed to
make transfers between countries. In lemma 1, we show that for any vector of
weights this equilibrium is a social optimum. In lemma 2, we show that a
nonempty set of weights exists for which the optimal transfers in such a cooperative
equilibrium are zero. Combined, these two lemmas give us proposition
1. (We view these equilibria with transfers simply as a convenient construct
for proving proposition 1, not as particularly interesting in their own right.)

To set up lemma 1, we need several definitions. We must first define a
cooperative equilibrium with transfers relative to any nonnegative vector of
weights $\lambda$. For brevity, call this a $\lambda$-cooperative equilibrium with transfers.
This equilibrium is composed of a competitive equilibrium for private agents
and a cooperative equilibrium for governments. We begin with the competitive
equilibrium. Let $x^i$ denote the amount of good 1 that each agent in
country i transfers to the rest of the world. Let $x = (x^1, \ldots, x^I)$, with
$x^I = -\sum_{i=1}^{I-1} x^i$, be the vector of such transfers. For a given vector of government
spending $\tau$ and transfers x, a competitive equilibrium is an allocation of
private consumption $(c_1, c_2)$ and a price p such that the allocations solve

$$W^i(\tau^i, x^i, p) = \max_{\{c_1^i,\, c_2^i\}} u^i(c_1^i, c_2^i, \tau^i) \tag{A1}$$

subject to

$$c_1^i + p\,c_2^i = y_1^i - \tau^i - x^i + p\,y_2^i$$

and satisfy the market-clearing conditions (2) and (3). Substituting the demand
functions into the market-clearing conditions gives the equilibrium
price as a function of government spending and transfers, say $p = p(\tau, x)$. We
then define a $\lambda$-cooperative equilibrium with transfers as a policy vector
$(\tau, x)$, a price function $p(\tau, x)$, and allocation functions $c_1(\tau, x, p(\tau, x))$ and
$c_2(\tau, x, p(\tau, x))$ that satisfy the following conditions: (i) the vector $(\tau, x)$ maximizes
$\sum_i \lambda^i W^i(\tau^i, x^i, p(\tau, x))$, and (ii) for each vector $(\tau, x)$, the resulting prices and
allocations constitute a competitive equilibrium. Next, a $\lambda$-social optimum is a
vector $(\tau, c_1, c_2, p)$, where the allocations maximize (5) subject to (2) and (3)
and where p denotes the normalized Lagrange multiplier for these constraints.
Notice that any vector $(\tau, c_1, c_2, p)$ that is a $\lambda$-social optimum satisfies
(2), (3), and

$$\frac{u_2^k}{u_1^k} = p \quad \text{for } k = 1, \ldots, I. \tag{A2}$$

With these definitions, it is straightforward to establish the first lemma. 

Lemma 1. For any nonnegative vector of weights $\lambda$, a $\lambda$-cooperative equilibrium
with transfers is a social optimum.

Proof. The cooperative equilibrium allocations must satisfy all the conditions
for a competitive equilibrium, while the allocations in the social optimum
must satisfy only market clearing. Thus for any $\lambda$ we have
$W(\lambda) \ge \sum_i \lambda^i W^i(\tau^i, x^i, p(\tau, x))$ evaluated at the cooperative policies $(\tau, x)$. If we can
choose transfers such that the $\lambda$-social optimum together with the transfers is
a $\lambda$-cooperative equilibrium, we are done. To this end, let $(\hat\tau, \hat c_1, \hat c_2, \hat p)$ be a
$\lambda$-social optimum. We claim that $(\hat\tau, \hat x, \hat c_1, \hat c_2, \hat p)$ is a $\lambda$-cooperative equilibrium,
where

$$\hat x^k = y_1^k - \hat c_1^k - \hat\tau^k + \hat p\,(y_2^k - \hat c_2^k) \quad \text{for } k = 1, \ldots, I. \tag{A3}$$

To see this, note that a $\lambda$-social optimum satisfies market clearing, the consumers'
first-order conditions (A2), and, by the definition of transfers, the
private-sector budget constraints. Hence $(\hat c_1, \hat c_2, \hat p)$ is a competitive equilibrium
given $(\hat\tau, \hat x)$. Since $(\hat\tau, \hat x)$ are feasible choices in the cooperative environment,
it follows that $\sum_i \lambda^i W^i(\hat\tau^i, \hat x^i, p(\hat\tau, \hat x)) \ge W(\lambda)$. Q.E.D.

In the next lemma, we show that the set of weights that respect private 
ownership is nonempty. These weights turn out to be exactly the set of 
weights for which the optimal transfers of lemma 1 are zero. 

Lemma 2. There exists a nonempty set S of nonnegative weights $\lambda$ such
that, for each $\lambda$ in S, the excess savings of each country are zero.

Proof. The proof is a fairly standard application of a fixed-point theorem
along the lines of Negishi (1960) and Mantel (1971). Recall that the excess
savings function of the ith country is

$$s^i(\lambda) = [y_1^i - c_1^i(\lambda) - \tau^i(\lambda)] + p(\lambda)\,[y_2^i - c_2^i(\lambda)] \tag{A4}$$

and is defined for all $\lambda$ in $\Delta$, where $\Delta = \{\lambda \in R^I \mid \lambda^i \ge 0, \text{ and } \sum_{i=1}^{I} \lambda^i = 1\}$.
These excess savings functions have three properties that are exploited in the
proof. First, they are continuous functions of $\lambda$. Second, feasibility implies
that they sum to zero. Third, these functions satisfy the condition that if
$\lambda^i = 0$, then $s^i(\lambda) \ge 0$. That is, if consumer i receives a zero weight in the social
optimum, then consumer i's excess saving is nonnegative.

Next, define the fixed-point map $g: \Delta \to \Delta$, where $g = (g^1, \ldots, g^I)$ and

$$g^i(\lambda) = \frac{\max[0,\ \lambda^i + s^i(\lambda)]}{\sum_{j=1}^{I} \max[0,\ \lambda^j + s^j(\lambda)]}. \tag{A5}$$

Notice that the denominator in (A5) is always positive. This is true because
$\sum_j [\lambda^j + s^j(\lambda)] = 1$ implies $\lambda^j + s^j(\lambda) > 0$ for some j, which in turn implies
that the denominator is positive. Since both the savings functions and the
maximum function are continuous, the function g is continuous. Since the
$g^i(\lambda)$ are nonnegative and sum to one, we know that $g(\lambda)$ is in $\Delta$. Thus g is a
continuous function that maps the compact, convex set $\Delta$ into itself. So by
Brouwer's theorem, we know that there is a nonempty set S of weights such
that $g(\lambda) = \lambda$ for all $\lambda$ in S.

To finish the proof, we must show that a fixed point of g is a zero of s; that
is, $g^i(\lambda) = \lambda^i$ for all i implies $s^i(\lambda) = 0$ for all i. If $\lambda$ is a fixed point of g, then,
for all i, $\alpha\lambda^i = \max[0, \lambda^i + s^i(\lambda)]$, where $\alpha$ is the denominator in (A5). This
implies that $\alpha\lambda^i = \lambda^i + s^i(\lambda)$ for all i, since we know that if $\lambda^i = 0$, then
$s^i(\lambda) \ge 0$. Summing over all consumers gives $\alpha\sum_i \lambda^i = \sum_i \lambda^i + \sum_i s^i(\lambda)$. Since the sum
of these savings functions is zero, we have $\alpha = 1$; thus $s^i(\lambda) = 0$ for all i.
Q.E.D.
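The fixed-point construction can be illustrated with a toy sketch. It is not the paper's economy: it assumes two consumers with log utility over two private goods and no government spending, so the spending term in the excess savings function drops out. The names `A`, `ENDOW`, and all functions are hypothetical. Because Brouwer's theorem gives existence only, and iterating the map need not converge, the sketch locates the zero of the excess savings function by bisection and then checks that it is a fixed point of the map.

```python
A = [0.3, 0.7]                      # hypothetical expenditure shares on good 1
ENDOW = [(1.0, 0.5), (0.0, 0.5)]    # hypothetical endowments (good 1, good 2)


def social_optimum(lam, a, totals):
    """Weighted-utility optimum for utilities a_i*ln(c1) + (1 - a_i)*ln(c2)."""
    m1 = sum(l * ai for l, ai in zip(lam, a)) / totals[0]        # multiplier, good 1
    m2 = sum(l * (1 - ai) for l, ai in zip(lam, a)) / totals[1]  # multiplier, good 2
    c1 = [l * ai / m1 for l, ai in zip(lam, a)]
    c2 = [l * (1 - ai) / m2 for l, ai in zip(lam, a)]
    return c1, c2, m2 / m1          # price = normalized multiplier


def excess_savings(lam, a, endow):
    """Value of endowment minus value of optimal consumption, per consumer."""
    totals = [sum(e[0] for e in endow), sum(e[1] for e in endow)]
    c1, c2, p = social_optimum(lam, a, totals)
    return [(e[0] - x1) + p * (e[1] - x2) for e, x1, x2 in zip(endow, c1, c2)]


def negishi_map(lam, a, endow):
    """The map g: weights plus excess savings, truncated at zero and renormalized."""
    s = excess_savings(lam, a, endow)
    raw = [max(0.0, l + si) for l, si in zip(lam, s)]
    total = sum(raw)
    return [r / total for r in raw]


def find_weights(a, endow, iters=80):
    """Bisect on consumer 1's weight until consumer 1's excess savings is zero."""
    lo, hi = 0.0, 1.0               # savings is >= 0 at weight 0 and <= 0 at weight 1
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if excess_savings([mid, 1.0 - mid], a, endow)[0] > 0:
            lo = mid
        else:
            hi = mid
    mid = 0.5 * (lo + hi)
    return [mid, 1.0 - mid]


if __name__ == "__main__":
    lam_star = find_weights(A, ENDOW)
    print(lam_star, excess_savings(lam_star, A, ENDOW), negishi_map(lam_star, A, ENDOW))
```

At the weights returned by `find_weights`, both excess savings are zero up to rounding and `negishi_map` returns the same weights, which is the toy counterpart of the claim that fixed points of g are zeros of s.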

Combining these two lemmas gives us proposition 1.

Proposition 1. A cooperative equilibrium that respects private ownership
is a social optimum.

Proof. A more precise statement of the proposition is that for any $\lambda$ in S, a
$\lambda$-cooperative equilibrium (without transfers) is a $\lambda$-social optimum. Comparing
(A3) and (A4), we see that the transfers used to support a given $\lambda$-social
optimum are simply the excess savings resulting from that optimum. Thus by
lemma 2 for any $\lambda$ in S, these optimal transfers are zero; so for such a $\lambda$, a
$\lambda$-cooperative equilibrium with transfers is a $\lambda$-cooperative equilibrium (without
transfers). Then by lemma 1, such a cooperative equilibrium is optimal.
Q.E.D.


References 

Backus, David K.; Devereux, Michael; and Purvis, Douglas. "A Positive Theory
of Fiscal Policies in Open Economies." In International Aspects of Fiscal
Policy, edited by Jacob A. Frenkel. Chicago: Univ. Chicago Press (for
NBER), 1988.

Devereux, Michael. "The International Coordination of Fiscal Policy and the
Terms of Trade: An Example." Manuscript. Toronto: Univ. Toronto,
1986.

Frenkel, Jacob A., and Razin, Assaf. "Government Spending, Debt, and International
Economic Interdependence." Econ. J. 95 (September 1985):
619-36.

Hamada, Koichi. "Strategic Aspects of International Fiscal Interdependence."
Econ. Studies Q. 37 (June 1986): 165-80.

Kehoe, Patrick J. "Coordination of Fiscal Policies in a World Economy." J.
Monetary Econ. 19 (May 1987): 349-76.

---. "Policy Cooperation among Benevolent Governments May Be Undesirable."
Rev. Econ. Studies 56 (April 1989): 289-96.

Lucas, Robert E., Jr., and Stokey, Nancy L. "Optimal Fiscal and Monetary
Policy in an Economy without Capital." J. Monetary Econ. 12 (July 1983):
55-93.

Mantel, Rolf R. "The Welfare Adjustment Process: Its Stability Properties."
Internat. Econ. Rev. 12 (October 1971): 415-30.

Negishi, Takashi. "Welfare Economics and Existence of an Equilibrium for a
Competitive Economy." Metroeconomica 12 (August-December 1960): 92-97.

Persson, Torsten, and Svensson, Lars E. O. "International Borrowing and
Time-Consistent Fiscal Policy." Scandinavian J. Econ. 88, no. 1 (1986): 273-95.

Rogoff, Kenneth S. "Can International Monetary Policy Cooperation Be
Counterproductive?" J. Internat. Econ. 18 (May 1985): 199-217.



Job Search Outcomes for
This paper examines how four components of the job search process
(the choice of search methods, the choice of how many firms to
contact, the rate at which offers are received, and the acceptance or
rejection of an offer) influence the job-finding rate. A reduced-
form model of job search is estimated that takes account of the fact
that users of a particular method of job search are not a random
subset of all searchers. The empirical analysis focuses on differences
in search behavior between the employed and unemployed. A key
finding of the analysis is that the offer rate per contact is greater for
employed searchers than for unemployed searchers. This may be
due to differences in the effectiveness of search while employed
versus unemployed or to unobserved differences in search effort.
Further research on this issue is needed because many models of job 
search behavior are based on the assumption that job search is more 
effective when one is unemployed. 


I. Introduction 

Individuals searching for a job have a number of choices to make
concerning the search process. These choices include which methods
of search to use, how much effort to devote to each method of search,
which firms to contact first, how many offers to collect before making 
an acceptance decision, and a criterion for deciding what constitutes 
an acceptable offer. Models of job search have supplied theories of 
many of these choices. These theories specify an objective function to 
be maximized, such as the expected discounted value of wealth or 
utility, and the constraints on the maximization, such as search costs 
or search technology and the budget set, and they derive rules for 
optimal choices from the constrained maximization problem. Search 
theories following this approach have been used to model the choice 
of acceptance criterion (e.g., Lippman and McCall 1976), search in¬ 
tensity and the number of offers to collect (Gal, Landsberger, and 
Levykson 1981; Benhabib and Bull 1983; Morgan 1983; Stern 1989), 
and which firms to contact first (Salop 1973). A considerable amount 
of empirical research has been based on such models. Some recent
empirical studies have estimated the parameters of structural
job search models (e.g., Flinn and Heckman 1982; Jensen and
Westergård-Nielsen 1987; Wolpin 1987; Blau 1989b; Stern 1989).
The most common empirical approach to job search behavior, how¬ 
ever, is estimation of reduced-form models in which particular job 
search outcomes, such as the duration of search and the accepted 
wage, are regressed on characteristics of the searchers and the search 
environment. 

Despite the large number of studies addressing specific aspects 
of the job search process, no single study has provided a consistent 
framework for separating out the effects of searcher characteristics 
on the different stages of the search process that ultimately determine 
the job-finding rate. In this study, we provide a consistent framework 
for analyzing four components of the search process: the choice of 
search methods, the choice of how many firms to contact given that a 
particular set of search methods is used, the rate at which offers are 
received given the contact rate, and the decision to accept an offer 
given the offer rate. These four components together determine the 
job-finding rate. 

In addition to developing a consistent empirical model of these 
four aspects of the job search process, we address an issue that has 
received some attention in the recent job search literature: the relative 
effectiveness of employed versus unemployed search. This issue is 
important because the validity of the theory of search unemployment 
rests implicitly on the notion that unemployed search is more effec¬ 
tive than employed search. As Clark and Summers (1979) have noted, 
if searching while employed is as effective as searching while unem¬ 
ployed, then the optimal strategy for a person seeking a new job is to 
accept the first offer received and to continue to search while employed.
Thus an empirical examination of the relative effectiveness of
employed and unemployed search should shed some light on the 
validity of the theory of search unemployment. 

Surprisingly few empirical studies have examined the effectiveness 
of employed and unemployed search. One study, by Holzer (1987b),
finds considerable evidence that searching while unemployed is more 
effective than searching while employed. Using data for youth from 
the 1981 panel of the National Longitudinal Survey (NLS), Hol¬ 
zer finds that the unemployed search more extensively (use more 
methods of search), search more intensively (search longer hours), 
collect more offers, and accept more offers. Overall, unemployed 
searchers are more than twice as likely to gain new employment as 
employed searchers. Holzer interprets his findings as providing con¬ 
siderable support for the theory of search unemployment. 

In this paper, we follow Holzer's approach of examining a broad 
range of search outcomes in order to analyze the relative effectiveness 
of employed and unemployed search. Our study differs from Hol- 
zer’s in several ways. First, we utilize a different data set and analyze a 
slightly different set of search outcomes. Second, we extend Holzer’s 
analysis by including adults as well as youth. Third, as indicated 
above, our empirical analysis provides a consistent framework for 
separating out the effects of observed searcher characteristics on the 
various components of the search process that determine the job¬ 
finding rate. 

In general, we find fairly small differences in the search behavior of 
employed and unemployed individuals. For some dimensions of job 
search, we find that employed search appears more effective than 
unemployed search. In particular, contrary to the findings of Holzer, 
we find that employed searchers are significantly more likely to gener¬ 
ate job offers and are significantly more likely to find new employ¬ 
ment than unemployed searchers, even though they tend to use fewer 
methods of job search and contact fewer firms. 

Our results raise some questions about the theory of search unem¬ 
ployment (see Lippman and McCall 1976) that need to be addressed 
in future studies. Is employed search really more effective than un¬ 
employed search, or is the difference accounted for by unobserved 
characteristics related to search effort that are correlated with em¬ 
ployment status? If employed search is more effective than unem¬ 
ployed search, then why don’t unemployed searchers simply accept 
the first offer they receive and continue to search while employed? 
Our results indicate that unemployed searchers reject a sizable num¬ 
ber of offers, and after accepting a job, most quit searching. 

The remainder of this paper is organized as follows. Section II
describes the data used to estimate the search model. Section III
presents basic search patterns for the employed and unemployed.
Section IV presents the estimation results. Section V concludes the
paper.


II. Data 

The data used in this study are from the Employment Opportunity 
Pilot Projects (EOPP) baseline household survey. The EOPP survey 
collected data from a sample of almost 30,000 families in 20 geo¬ 
graphically dispersed sites throughout the United States, from April 
through October 1980. The purpose of the survey was to provide 
baseline data in pilot and control sites for an evaluation of EOPP, 
which provided intensive job search assistance and training to low- 
income individuals in the pilot sites. 1 The EOPP program was phased 
out in 1981, but the baseline survey provides a rich source of data for 
the analysis of job search behavior. 2

The EOPP survey collected retrospective information on a variety 
of job search activities for the period January 1979 to the date of the 
interview (April-October 1980). This paper makes use of the sub- 
sample of married men, married women, single women, and teen¬ 
agers (aged 16-19) who reported experiencing at least one spell of 
job search (either employed or unemployed) during the period covered
by the longitudinal labor market history. 3 The sample includes
both complete and incomplete spells of employment and unemploy¬ 
ment. Only spells with known starting dates are included in the sam¬ 
ple. 4 


1 The EOPP survey is not a random sample of the populations in the sites. In
particular, it oversampled low- and middle-income families because the pilot
projects were testing a program targeted to welfare and welfare-eligible families.
However, high-income families were represented in the sample so that sample
truncation is not a problem. As Blau and Robins (1986b) point out, employment and
unemployment rates in the EOPP sample are quite similar to aggregate rates
prevailing at the time of the survey (1980-81).

2 The baseline survey was conducted during the early stages of EOPP, and it is 
unlikely that the individuals surveyed had yet responded to the program. Hence, for 
purposes of analysis, the data from families in the pilot sites may be viewed as prepro¬ 
gram. It should be noted that the results in this paper refer to a particular sample and 
time period and may not be generalizable to other samples or time periods. 

3 We use the most recent job search spell to minimize problems of recall. In addition,
if a given search spell overlaps with periods of both employment and unemployment,
we exclude the spell from the analysis. We do this because the timing of the information
on contacts, offers, etc. is not given and it is not possible to attribute the search
information to the employment or unemployment part of the spell. Very few search spells
overlap with periods of both employment and unemployment. For example, only about
7 percent of all unemployment search spells become employment search spells (see
Blau 1989a).

4 We exclude left-censored spells from the analysis because several of the search
outcomes examined are constructed using the observed length of the spell, and it is not

For each job search spell, the survey gathered information on the
start and end dates of the spell (the end date is the interview date if
the spell was still in progress at the time of the survey), the methods of
job search used by the searcher, the number of contacts made using
each method, the number of offers obtained using each method, and
the acceptance or rejection of an offer using each method. The job
search methods identified in the survey include the state employment
service (SES), private employment agencies (PEA), friends and relatives
(FRND), newspapers and other periodical advertisements
(NEWS), direct employer contact (EMP), and a variety of other infrequently
used methods that we have subsumed under a single heading
(OTH). 5 For each method of job search, the total reported numbers
of contacts, offers received, and acceptances during the spell are divided
by the length of the spell to compute the average weekly contact,
offer, and acceptance rates.
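
The rate construction just described can be sketched in a few lines. This is only an illustration under stated assumptions: the record layout (`MethodSpell`), the field names, and the numbers are hypothetical and are not taken from the EOPP files, and the spell length is assumed to be measured in weeks.

```python
from dataclasses import dataclass


@dataclass
class MethodSpell:
    """Totals reported for one search method over one spell (hypothetical layout)."""
    contacts: int
    offers: int
    acceptances: int
    weeks: float  # observed length of the search spell, in weeks


def weekly_rates(spell: MethodSpell) -> dict:
    """Divide the spell totals by the spell length to get average weekly rates."""
    if spell.weeks <= 0:
        raise ValueError("spell length must be positive")
    return {
        "contact_rate": spell.contacts / spell.weeks,
        "offer_rate": spell.offers / spell.weeks,
        "acceptance_rate": spell.acceptances / spell.weeks,
    }


# Invented example: 12 contacts, 2 offers, and 1 acceptance over an 8-week spell.
example = MethodSpell(contacts=12, offers=2, acceptances=1, weeks=8.0)
rates = weekly_rates(example)
```

Because the totals are averaged over the whole spell, rates built this way carry no information about when within the spell the contacts and offers occurred.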

III. Basic Search Patterns by the Employed 
and Unemployed 

Table 1 reports basic statistics on the utilization rates of each method
of job search by both the employed and unemployed. The utilization 
rates are given separately for each demographic group and for all the 
groups combined. Also reported are the results of tests of differences 
in the utilization rates between employed and unemployed searchers. 

As table 1 indicates, there is a sizable number of employed search
spells, although they represent only about 10 percent of all employment
spells in the EOPP data base. For men, roughly one-third of the
search spells occur during employment, while for women (married
and single) and teenagers, about one-fifth and one-tenth, respectively,
occur during employment. The number of search methods
used is very similar for employed and unemployed searchers, averaging
1.9 methods for employed searchers and 2.1 methods for unemployed
searchers. This difference, however, is statistically significant
at the 1 percent level. In contrast, Holzer (1987b) finds a somewhat
larger difference in the number of search methods used by youth. He 
reports that employed searchers use about 2.7 methods per month 
while unemployed searchers use about 3.3 methods. Our data indi¬ 
cate virtually no difference in the number of search methods used by 


clear that the search information given in the EOPP survey applies to only the uncen¬ 
sored portion of the spell. 

5 Included in the OTH category are school placement officers/teachers or professors,
community action groups, urban league, welfare agencies, local CETA or WIN
program, labor unions, civil service test or federal job application, and other
(unspecified) methods.

employed and unemployed youth (1.8 vs. 1.9 methods). The main
difference between Holzer’s data (1981 NLS youth sample) and the
EOPP data is that the search questions on the NLS were asked only of
individuals who had searched for work in the month prior to the
survey date, while the EOPP survey elicited information about search
spells over a 16-22-month period prior to the survey date. It is possible
that the longer period of recall in the EOPP survey caused searchers
to forget about some methods they used.

According to table 1, the most frequently used method of job 
search is direct employer contact, although for single women and 
employed married women, newspapers are used slightly more often. 
The least-used method among those listed is private employment 
agencies, although many of the individual categories subsumed under 
OTH are used less frequently. Interestingly, SES is used much more 
frequently by unemployed searchers, probably reflecting provisions 
of the unemployment insurance program that require recipients to 
register with the SES (see Keeley and Robins 1985). Also, among 
married men and married women, unemployed searchers are more 
likely to contact employers directly, and among all groups, the unem¬ 
ployed are less likely to use private employment agencies, perhaps 
because of their cost. 

Table 2 presents average weekly contact, offer, and acceptance 
rates for the combined sample (a breakdown by demographic group 
is available from the authors). In comparing the rates across methods, 
one must keep in mind that they do not adjust for differences in 
search intensity (hours of search). In other words, the rates reported 
in this table are calculated at the level of job search intensity chosen by 
the searcher and hence presumably reflect an optimal allocation of 
search intensity among the various methods. 

Table 2 indicates that, on average, employed and unemployed
searchers make about the same total number of contacts (between 2.1
and 2.2 per week). However, offers are received more frequently by
employed searchers, averaging about 0.30 per week compared to
about 0.18 per week for unemployed searchers. 6 The significantly
higher offer rate among employed searchers contrasts sharply with
the findings of Holzer (1987b), where a higher offer rate is found for
the unemployed. Because of the higher offer rate, table 2 indicates
that the rate of gaining new employment is also higher for employed 
searchers, averaging 13 percent per week compared to 10 percent per 


6 It is worth noting that the EOPP data report a considerably higher offer rate than
other data (see Blau and Robins [1986a] and Blau [1989a] for a discussion). For
example, Holzer (1987b) reports a monthly offer rate in the NLS data about equal to the
weekly offer rate in the EOPP data.

TABLE 2

Average Weekly Contact, Offer, and Acceptance Rates by Method
of Job Search (for Users of the Method)

                                    Method of Job Search
Outcome                 SES      PEA      FRND     NEWS     EMP      OTH      All

Contact Rate
  Employed              .60      1.03     .79      1.41     1.67     .90      2.18
  Unemployed            .51      .68      .67      1.36     1.53     .71      2.11
  t-statistic for
    difference          .86      1.51     1.52     .38      1.20     1.29

Offer Rate
  Employed              .08      .19      .18      .15      .19      .13      .30
  Unemployed            .04      .10      .10      .09      .11      .09      .18
  t-statistic for
    difference          2.24**   2.71***  3.66***  4.36***  5.21***  1.66*    7.24***

Acceptance Rate
  Employed              .03      .07      .08      .05      .09      .06      .13
  Unemployed            .02      .04      .07      .04      .07      .05      .10
  t-statistic for
    difference          .73      1.67*    1.57     2.33**   2.82***  .28      3.30**

* Significant at the 10 percent level.
** Significant at the 5 percent level.
*** Significant at the 1 percent level.


week for the unemployed. Again, by way of contrast, Holzer finds a 
much higher job-finding rate for unemployed searchers. 

Among the individual methods of job search, EMP and NEWS
generate the most contacts; PEA and EMP tend to generate the most 
offers, although a sizable number of offers are also generated by 
FRND; and FRND and EMP have the highest job-finding rates. 

For every method of search, the results in table 2 indicate that 
employed searchers collect more offers and accept more jobs than 
unemployed searchers. Hence, the EOPP data suggest that employed 
search may be more effective than unemployed search, in contrast to 
the findings of Holzer (1987b), whose analysis was based on NLS data.
However, it is important to note that the results in table 2 do not 
adjust for observed or unobserved differences in searcher character¬ 
istics that could be related to search effectiveness. At this point, it 
would be premature to conclude that a randomly chosen individual 
would be able to search more effectively while employed than while 
unemployed. Some evidence on this is presented below.

Another way of examining job search behavior is to calculate conditional
offer and acceptance rates. These are given in table 3. 7 As this
table indicates, employed searchers generate more offers and acceptances
for each contact made, while unemployed searchers are somewhat
less selective in accepting job offers (perhaps because of the
greater costs of further search). The data also indicate that while a 
majority of job offers are accepted by both employed and unem¬ 
ployed searchers, a sizable number of offers are also rejected. Among 
employed searchers, close to one-half of all offers are rejected, while 
for unemployed searchers, about one-third of all offers are rejected. 
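
As a rough illustration of how conditional rates of this kind can be computed individual by individual and then averaged over the sample, consider the sketch below. The data are invented, the helper name `mean_of_ratios` is hypothetical, and dropping individuals with a zero denominator is an assumption of the sketch rather than a rule stated in the text.

```python
from statistics import mean


def mean_of_ratios(numerators, denominators):
    """Average individual-level ratios over individuals with a positive denominator."""
    ratios = [n / d for n, d in zip(numerators, denominators) if d > 0]
    if not ratios:
        raise ValueError("no individual has a positive denominator")
    return mean(ratios)


# Invented totals for four searchers using a single method.
contacts = [10, 4, 0, 5]
offers = [2, 1, 0, 0]
acceptances = [1, 1, 0, 0]

# Conditional rates: computed per individual, then averaged.
offers_per_contact = mean_of_ratios(offers, contacts)
acceptances_per_offer = mean_of_ratios(acceptances, offers)

# For comparison only: the pooled ratio of sample totals.
pooled_offers_per_contact = sum(offers) / sum(contacts)
```

The average of individual ratios generally differs from the ratio of sample totals, so the two should not be compared directly.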

Among the methods of job search, FRND appears to be the most 
effective for both the employed and unemployed. It generates the 
most offers per contact and the most acceptances per contact, and 
it has the highest acceptance rate per offer. Among unemployed 
searchers, three-quarters of all job offers received through FRND are 
accepted. The apparent effectiveness of FRND as a method of search 
corresponds closely with the findings of Holzer (1987a, 1988). How¬ 
ever, it should be noted that differences in offer rates across methods 
do not necessarily imply differences in effectiveness. Searchers may 
select methods on the basis of observed and unobserved characteris¬ 
tics that are associated with the productivity of the method. Also, if 
offer rates vary with the duration of search and individuals optimally 
choose search methods over time, differences in average offer rates 
could appear in the data even if the underlying rates are the same 
across methods. 

IV. A Reduced-Form Model of Job Search 
Behavior 

The estimates presented above suggest that employed search is more 
effective than unemployed search. However, these estimates do not 
adjust for differences in the characteristics of employed and unem¬ 
ployed searchers and do not take account of the fact that the choice of 
search method is endogenous. In this section, we describe and present 
estimates of a reduced-form job search model that takes these factors 
into account. 

The empirical analysis is based on the following relationship be¬ 
tween the job-finding probability and four components of the job 
search process: 

$$P_{ij} = P(A \mid O)_{ij}\, P(O \mid C)_{ij}\, E(C \mid U)_{ij}\, P(U)_{ij}, \tag{1}$$

where $P_{ij}$ is the job-finding probability using method $i$ (the probability

7 These conditional rates are calculated for each individual and then averaged over 
the sample. 



TABLE 3 

Average Conditional Offer and Acceptance Rates by Method of Job Search 




* Significant at the 10 percent level.
** Significant at the 5 percent level.
*** Significant at the 1 percent level.

that person $j$ accepts an offer generated by using method $i$ during a
given week), $P(A \mid O)_{ij}$ is the acceptance probability (the probability
that person $j$ accepts an offer, given that an offer has been received
using method $i$), $P(O \mid C)_{ij}$ is the offer probability (the probability that
person $j$ receives an offer, given that an employer has been contacted
using method $i$), $E(C \mid U)_{ij}$ is the contact rate (the expected number of
contacts per week for person $j$ using method $i$), and $P(U)_{ij}$ is the
probability that person $j$ uses method $i$.

Equation (1) defines the weekly rate of finding a job using a given 
search method as the product of the conditional acceptance, offer, 
and contact rates and the probability of using the method. The term 
$P_{ij}$ can be interpreted as the expected job-finding rate using method $i$
of a random member of a homogeneous population. 8 If there were no
economies or diseconomies of using multiple search methods, then
the overall job-finding rate for person $j$ would be $P_j = \sum_i P_{ij}$. 9
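
The decomposition in equation (1) can be illustrated numerically. The sketch below is a minimal example, not the estimation procedure: the function name and all component values are hypothetical, and the sum across methods applies only under the stated condition of no economies or diseconomies from using multiple methods.

```python
def job_finding_rate(p_accept_given_offer, p_offer_given_contact,
                     contacts_per_week, p_use):
    """Equation (1): product of the four components for one method."""
    return (p_accept_given_offer * p_offer_given_contact
            * contacts_per_week * p_use)


# Invented component values for two methods:
# (P(A|O), P(O|C), E(C|U), P(U)).
components = {
    "FRND": (0.75, 0.20, 0.70, 0.50),
    "EMP": (0.60, 0.10, 1.50, 0.80),
}

rate_by_method = {
    method: job_finding_rate(*values) for method, values in components.items()
}

# Overall weekly job-finding rate if the methods do not interact.
overall_rate = sum(rate_by_method.values())
```

Writing the rate as a product makes it possible to see which component accounts for a difference between two groups of searchers, which is the use the paper makes of the decomposition.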

We specify reduced-form equations for each of the four terms on
the right-hand side of equation (1). Let $y_{1ij} = A_{ij}/O_{ij}$, where $A_{ij}$ equals
one if individual $j$ accepts a job using method $i$ and zero otherwise,
and $O_{ij}$ is the number of offers received by individual $j$ using method $i$.
Let $y_{2ij} = O_{ij}/C_{ij}$, where $C_{ij}$ is the average number of contacts made per
week by individual $j$ using method $i$. Let $y_{3ij} = C_{ij}$, and let $y_{4ij}$ equal one
if individual $j$ uses method $i$ and zero otherwise. The variables $y_{kij}$,
$k = 1, \ldots, 4$, are the observed realizations of the terms on the right-
hand side of equation (1). Our statistical model for these observed
variables is specified as

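$$y^*_{1ij} = X_j \beta_1 + \varepsilon_{1ij},$$

$$y_{1ij} = \begin{cases} 1 & \text{if } y^*_{1ij} \ge 1 \\ y^*_{1ij} & \text{if } 1 > y^*_{1ij} > 0 \\ 0 & \text{if } 0 \ge y^*_{1ij}; \end{cases} \tag{2}$$

$$y^*_{2ij} = X_j \beta_2 + \varepsilon_{2ij},$$

$$y_{2ij} = \begin{cases} 1 & \text{if } y^*_{2ij} \ge 1 \\ y^*_{2ij} & \text{if } 1 > y^*_{2ij} > 0 \\ 0 & \text{if } 0 \ge y^*_{2ij}; \end{cases} \tag{3}$$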

8 Note that, strictly speaking, $P_{ij}$ can be interpreted as a probability only if the
expected number of contacts per week is less than or equal to one. The length of a period
can always be defined so that $E(C \mid U)$ is less than or equal to one. As indicated in table 2,
the weekly contact rate exceeds one for several methods, but $P_{ij}$ is always considerably
less than one.

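9 Blau and Stern (1989) show that there are apparently diseconomies involved in
using multiple search methods.

$$y^*_{3ij} = X_j \beta_3 + \varepsilon_{3ij},$$

$$y_{3ij} = \begin{cases} y^*_{3ij} & \text{if } y^*_{3ij} > 0 \\ 0 & \text{if } y^*_{3ij} \le 0; \end{cases} \tag{4}$$

$$y^*_{4ij} = X_j \beta_4 + \varepsilon_{4ij},$$

$$y_{4ij} = \begin{cases} 1 & \text{if } y^*_{4ij} > 0 \\ 0 & \text{if } y^*_{4ij} \le 0. \end{cases} \tag{5}$$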

In each of the four models above, $y^*_{kij}$ is a latent index, $\beta_k$ is a
parameter vector, $X_j$ is a vector of explanatory variables, and $\varepsilon_{kij}$ is a
disturbance. Our model is reduced form in nature, so the same set of
variables appears in each equation. 10

We assume that $\varepsilon_{kij}$, $k = 1, \ldots, 4$, are jointly normally distributed
with zero means and a typical element of the covariance matrix given
by $E(\varepsilon_{kij}\varepsilon_{lij}) = \sigma_{kl}$. 11 The normality assumption means that equations
(2) and (3) are two-limit tobit models, equation (4) is a single-limit
tobit model, and equation (5) is a probit model. The two-limit tobit
models are used to account for cases with multiple contacts or offers,
recognizing that at most one offer can be accepted and at most one
offer can result from a contact. Note that equation (2) for the acceptance
rate per offer using method $i$ can be estimated only on the
sample of people who have received at least one offer using method $i$,
that is, only for cases with $y_{2ij} > 0$. Similarly, equation (3) for offers per
contact can be estimated only on the sample for which the number of
contacts from method $i$ is positive ($y_{3ij} > 0$), and equation (4) for
contacts can be estimated only on the sample that uses method $i$
($y_{4ij} = 1$). Thus if the disturbances are in fact correlated across equations,
then the use of standard estimation methods for equations (2)-(4) 
would result in inconsistent estimates because of sample selection. 
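
To make the two-limit tobit structure of equations (2) and (3) concrete, the following sketch shows the censoring rule and the log-likelihood contribution of a single observation. It is a single-equation illustration only: it ignores the cross-equation correlation and sample selection just discussed, so it is not the bivariate procedure used for the estimates, and the function names are hypothetical.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def observe(y_star, lower=0.0, upper=1.0):
    """Map the latent index to the observed outcome, censoring at both limits."""
    return min(max(y_star, lower), upper)


def two_limit_tobit_loglik(y, xb, sigma, lower=0.0, upper=1.0):
    """Log-likelihood contribution of one observation with index xb = X_j * beta."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if y <= lower:
        # Latent index at or below the lower limit.
        return math.log(norm_cdf((lower - xb) / sigma))
    if y >= upper:
        # Latent index at or above the upper limit.
        return math.log(norm_cdf((xb - upper) / sigma))
    # Uncensored observation: normal density scaled by sigma.
    return math.log(norm_pdf((y - xb) / sigma)) - math.log(sigma)


# With xb midway between the limits, the two censoring probabilities are equal.
ll_lower = two_limit_tobit_loglik(0.0, xb=0.5, sigma=1.0)
ll_upper = two_limit_tobit_loglik(1.0, xb=0.5, sigma=1.0)
ll_inside = two_limit_tobit_loglik(0.5, xb=0.5, sigma=1.0)
```

Setting `upper` to `math.inf` and observing only positive values gives the single-limit case of equation (4); the probit of equation (5) uses only the sign of the latent index.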

An important point to note about the specification in equations (2)— 
(5) is the absence of duration dependence. The EOPP survey, as 
noted above, reports the total number of acceptances, offers, and 
contacts generated from each method used for each search spell but 
not the timing of these events. In the absence of data on the timing of 
the contacts and offers, disentangling the effects of true duration 
dependence from spurious duration dependence induced by unob¬ 
served heterogeneity is difficult and is not attempted here. 


10 Holzer (1987b) estimates equations explaining the choice of search methods and
includes as an explanatory variable the predicted number of offers based on a first-
stage regression. Hence, his model is not a reduced form and is based on (apparently)
arbitrary exclusion restrictions.

11 Note that $\sigma_{44}$ is normalized to unity since it is not identified.

Estimation of equations (2)-(5) was accomplished as follows. Consistent
estimates of the $\beta$'s and the covariance matrix of the disturbances
were obtained by performing bivariate maximum likelihood
separately on each pair of equations. The resulting estimates of the
$\beta$'s were very similar to those obtained from single-equation estimates,
but the estimated standard errors were substantially smaller.
Of course, full-information maximum likelihood would have been 
still more efficient but cumbersome and costly because of the four- 
variate integration required. 

Equations (2)—(5) were estimated for each of the six different 
search methods and for all methods combined. In the combined 
methods sample, equation (5) was replaced by an ordinary least 
squares equation explaining the number of methods used (rather 
than the probability of using a given method). Computational feasibil¬ 
ity precluded joint estimation across methods. 12 Each equation was 
estimated separately for employed and unemployed searchers, thus 
allowing all coefficients to differ between the two groups. The equa¬ 
tions were also estimated on the combined sample of employed and 
unemployed searchers with an employment status dummy variable 
(1 = employed) included on the right-hand side of the equation. 
By including employment status of the searcher as an explanatory
variable, we assume that it is exogenous to the choice of search
method and outcomes. 13

The explanatory variables included in each equation are dummy 
variables for demographic groups (married women, single women, 
and youth aged 16-19, with married men the omitted category), race 
(black and Hispanic, with white the omitted category), location (living 
in a standard metropolitan statistical area), receipt of unemployment 
insurance (unemployed searchers only), receipt of welfare (Aid to 
Families with Dependent Children or food stamps), years of work 
experience, years of education, average monthly nonwage income 
during the search spell, the local unemployment rate at the time of 
the spell, and the hourly wage at the end of the most recent previous
job held (for unemployed search spells) or at the beginning of the
current job (for employed search spells), if available. 14
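
One common way to keep observations whose wage is unavailable is to set the wage to zero and add an indicator for the missing value. The sketch below illustrates that coding; the function name is hypothetical and the example wage is invented.

```python
from typing import Optional, Tuple


def code_wage(wage: Optional[float]) -> Tuple[float, int]:
    """Return (wage regressor, missing-wage dummy) for one search spell."""
    if wage is None:
        # Wage unavailable: regressor set to zero, dummy set to one.
        return 0.0, 1
    return wage, 0


missing_case = code_wage(None)
observed_case = code_wage(4.5)
```

With this coding the coefficient on the dummy absorbs the average difference for spells with no recorded wage, so the wage coefficient is identified only from spells where the wage is observed.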

12 Blau and Stern (1989) use the method of moments approach to estimate search 
method choice and outcomes jointly for all the methods. 

13 The employment status of a job searcher could be related to the perceived
productivity of searching while employed vs. unemployed. If there is heterogeneity in
unobserved components of search productivity, then employment status while searching
should be treated as an endogenous variable. Our conclusions regarding the relative 
effectiveness of employed vs. unemployed search should be interpreted with this caveat 
in mind. 

14 If the previous wage is unavailable, then a dummy variable is set to one and the
wage variable is set to zero.

The results for all search methods combined are presented in table
4 (the results for each search method are in an appendix available
from the authors). The table also reports the estimated standard deviations
of the error terms ($\sigma_{kk}$) and the estimated correlation coefficients
of the error terms across equations ($\rho_{kl}$).

According to the results in table 4, with other characteristics held 
constant, married men use the most methods of search and make the 
most contacts, but they receive and accept the fewest offers. Blacks 
use more methods of search than whites but make fewer contacts, 
receive fewer offers, and accept fewer offers. Overall, the results 
imply that the main source of the lower job-finding rate for blacks is a 
lower contact rate. Although, in contrast to blacks, Hispanics use 
fewer methods of search than whites, like blacks they make fewer 
contacts and accept fewer offers. 15 

Searchers receiving unemployment insurance benefits use more 
methods of search (probably reflecting the search provisions of the 
unemployment insurance program) but make fewer contacts and re¬ 
ceive fewer offers. The same pattern is observed for welfare recipi¬ 
ents. As a consequence of the lower contact and offer rates, unem¬ 
ployment insurance and welfare recipients have a lower probability of 
finding a job. Numerous studies have found that receiving transfer 
benefits is associated with a lower escape rate from unemployment. 
Our results confirm this finding but provide additional evidence on 
the source of the lower job-finding rate. In the detailed results by 
search method available from the authors, it is found that both unem¬ 
ployment insurance and welfare recipients use more methods of 
search because of a much higher probability of using the SES. Most of 
them are required to register with the SES in order to receive benefits. 
Note that the acceptance rate per offer of these recipients is not 
significantly different from that of nonrecipients, suggesting either 
that reservation wages do not differ on average between recipients 
and nonrecipients or that regulations imposed by program officials 
effectively prevent recipients from turning down too many offers. 


15 A more detailed breakdown of black-white differences is possible using the estima¬ 
tion results separately by method (available from the authors). These results indicate 
that the main source of the lower job-finding rate for unemployed blacks is a lower 
frequency of contacting employers directly and substantially fewer offers per contact 
via newspaper advertisements. However, the lower job-finding rate of unemployed 
blacks is also due in part to lower acceptance rates per offer for four of the six methods. 
Blacks have a higher acceptance rate per offer only via the OTH method, which 
includes a variety of government programs aimed at the disadvantaged. These results 
are similar in some respects to Holzer's (1987a), but our results indicate that use of
friends is at least as productive for unemployed blacks as for whites. Holzer found that
for men aged 16-23 a lower offer rate via friends accounts for a substantial share of the 
lower overall job-finding rate of blacks. In the case of employed searchers, we do find 
that blacks have significantly fewer contacts, offers, and acceptances via friends. 
This could rationalize the apparent search strategy of generating
fewer offers per contact if the goal is to remain unemployed longer. 

The human capital variables have mixed effects. Higher-wage indi¬ 
viduals use fewer methods of search and make fewer contacts, but 
they are able to generate more offers. However, higher-wage individ¬ 
uals are no more likely to accept an offer than lower-wage individuals. 
Education is positively related to the number of methods used, the 
number of contacts made, and the number of offers received, but it is 
negatively related to the conditional acceptance rate. Work experi¬ 
ence has a negligible effect on all the outcome measures. 

The state of the local economy also affects search decisions. When 
the local unemployment rate rises, individuals tend to search more 
extensively (use more methods), but they generate fewer contacts and 
offers. A higher unemployment rate also induces searchers to reject 
fewer job offers. 

After we adjust for the effects of observed variables and sample 
selection, the results indicate that the employed search less exten¬ 
sively, generate fewer contacts, receive more offers, and have a condi¬ 
tional acceptance rate similar to that of the unemployed. Recall that 
the unadjusted results indicate less extensive search by the employed, 
a similar contact rate, a higher offer rate, and a lower conditional 
acceptance rate. 

There is weak but significant correlation among the error terms in 
the four equations. Unobserved characteristics that lead to more 
methods being used also lead to more contacts, but fewer offers and 
acceptances are generated. Apparently the more extensive the search, 
the less likely the searcher is to achieve positive search outcomes. 
Successful search, therefore, seems to be more likely when fewer 
methods are used. 

Table 5 gives predicted employed-unemployed differences in the
various search outcomes for each method of job search, derived from
separate estimates by method available from the authors, and for all
methods combined. The derived effect on the unconditional job-finding
probability is also given. The net effect of the four components
of the search process is a slightly higher job-finding rate among
employed searchers. This higher job-finding rate is fully attributable
to a higher offer rate.
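
The way the four components combine can be illustrated with a minimal sketch. It assumes that the unconditional acceptance rate per week is simply the product of the four stages (probability of using a method, contacts per week, offers per contact, acceptances per offer); the text says only that the final column is derived from the previous four. All numerical levels below are hypothetical and are not estimates from the paper.

```python
def acceptances_per_week(p_use, contacts_per_week, offers_per_contact, accepts_per_offer):
    """Unconditional acceptances per week, assumed to be the product of the four stages."""
    return p_use * contacts_per_week * offers_per_contact * accepts_per_offer


# Hypothetical levels for illustration only (NOT taken from the paper's tables).
unemployed = dict(p_use=0.50, contacts_per_week=2.0,
                  offers_per_contact=0.10, accepts_per_offer=0.60)
employed = dict(p_use=0.40, contacts_per_week=1.8,
                offers_per_contact=0.16, accepts_per_offer=0.58)

rate_unemployed = acceptances_per_week(**unemployed)
rate_employed = acceptances_per_week(**employed)

# The employed use the method less often and make fewer contacts, yet a higher
# offer rate per contact can still leave them with the higher overall rate.
difference = rate_employed - rate_unemployed
print(f"unemployed: {rate_unemployed:.4f}, employed: {rate_employed:.4f}, "
      f"difference: {difference:.4f}")
```

With these illustrative numbers, lower usage and contact rates are more than offset by the higher offer rate, which is the qualitative pattern described above.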

There are some differences in search behavior across the various 
methods. Unemployed searchers are much more likely to use the SES 
than employed searchers and are somewhat more likely to use NEWS 
and EMP. Employed searchers are more likely to make contacts using
PEA, but as noted earlier, few searchers use this method. In general, 
employed searchers tend to make contacts using PEA and FRND, 
TABLE 5

Employed-Unemployed Differences in Search Outcomes by Method of Search
(Based on Reduced-Form Estimates)

          Probability of     Contacts    Offers per   Acceptances   Acceptances
Method    Using Method (a)   per Week    Contact      per Offer     per Week (b)

SES       —                  -.07*       0            -.02          -.02
PEA       .01                .37***      .03*         -.02          .01
FRND      0                  .08**       .04***       .04***        .01
NEWS      -.05***            -.12**      .03***       0             -.01
EMPL      -.03***            -.02        .05***       -.01          .01
OTH       -.03***            .08         -.04***      -.02          -.01
All       -.19***            -.15***     .04***       -.01          .02

Note. Calculated at the means of all explanatory variables from the fitted values of the estimated probit, single-limit
tobit, and two-limit tobit equations. Significance levels are based on the significance of the employment status
dummy variable in each equation.

(a) Difference for all methods is based on the number of methods used.

(b) Derived from the results in the previous four columns; the significance levels are not available.

* Significantly different at the 10 percent level.
** Significantly different at the 5 percent level.
*** Significantly different at the 1 percent level.


while unemployed searchers tend to make contacts using SES, NEWS,
and EMP. 


V. Summary and Conclusions 

In this paper, we have examined how individual components of the 
job search process influence the probability of reemployment. A 
reduced-form model of job search is estimated that takes account of 
the fact that users of a particular method of job search are not a 
random subset of all searchers. The empirical model produces esti¬ 
mates of the parameters governing the choice of search method, the 
contact rate per method, the process by which job offers are generated,
and the acceptance or rejection of offers. Particular attention is
paid to the role of observed characteristics in explaining differences 
in search behavior between the employed and unemployed. 

The results of this paper show that individuals who search for a 
new job while working are, on average, more successful at finding a 
job than otherwise similar unemployed searchers. As was indicated 
above, there are two possible explanations for this finding. One is that 
employed search is more effective than unemployed search, perhaps 
because of better search technology (e.g., access to internal career 
ladders and better contacts) or the stigma associated with unemploy¬ 
ment. If this explanation is correct, then a typical unemployed job 
seeker would have a better chance of finding a desirable job by accept¬ 
ing the first offer received and continuing to search while employed. 
However, such behavior may not be optimal for the individual if there
are important differences in search costs between the employed and 
unemployed. 

The other explanation is based on unobserved heterogeneity: em¬ 
ployed searchers may simply search harder or may be better searchers 
than the unemployed in ways that are not captured by observed variables.
In other words, employment status may be correlated with (or
serve as a signal of) search ability or effort. In this case, there is no
presumption that employed search is more effective than unem¬ 
ployed search for any particular individual; unobserved differences 
across individuals are responsible for the higher job-finding rate. 

This study has identified an important phenomenon in the labor 
market that warrants an explanation. In future work, it would be 
useful to attempt to determine which of the two alternative explana¬ 
tions is empirically most important. 
Cold Houses and Warm Climates Revisited:
On Keeping Warm in Chicago, 
or Paradox Lost 

Donald N. Dewees and Thomas A. Wilson 

University of Toronto 


I. Introduction 

In a recent issue of this Journal, David Friedman (1987) explains his 
observation that houses in warm climates are less well heated than 
houses in cold climates by theoretical proofs that houses in cold cli¬ 
mates will be better insulated than those in warm climates, lowering 
the marginal cost of a degree of indoor temperature in the cold cli¬ 
mate and raising the optimal indoor temperature. We argue that 
there are other reasons for warm temperatures in cold climates, ren¬ 
dering meaningless a north-south test of Friedman’s hypothesis. We 
introduce survey data showing that thermostats are in fact set higher in 
warm climates. We propose tests that can more precisely discriminate 
between the remaining implications of the Friedman and the Dewees- 
Wilson hypotheses and can be performed without having to leave 
Chicago (or Toronto). 

II. Temperature and Comfort:
Not a Constant Relationship

Friedman assumes that comfort is determined only by air tempera¬ 
ture (p. 1090). In fact, comfort depends critically on two additional 
factors ignored by Friedman: humidity and radiation. For a given 

This research was funded by a grant from the Social Sciences and Humanities Research
Council of Canada. We thank David Friedman for many helpful comments
incorporated into this paper. We remain responsible for any errors or omissions.
air temperature, lower relative humidity increases the rate of evaporation
of moisture, lowering skin temperature. At 70 degrees
Fahrenheit, 30 percent humidity feels about 3 degrees cooler than 70 
percent humidity (Eshbach 1952, pp. 8-107). Furthermore, because 
warm air can hold more moisture than cold air, when outdoor air at a 
given relative humidity infiltrates a house in winter and is warmed, a 
much lower relative humidity indoors results if the outdoor tempera¬ 
ture is low than if it is moderate. Therefore, indoor humidity falls 
with outdoor temperature. In addition, the warm human body 
radiates heat to cooler surroundings. With an air temperature of 70 
degrees, standing in a room with cold walls feels colder than standing 
in the same room with warm walls because of the additional radiation 
loss from the body to the walls. The surface of the walls of an older 
house in Chicago in wintertime will be far colder than the walls of the 
same house in the summertime and than the walls of a Miami house in 
wintertime. The Chicagoan will turn up the thermostat to compensate 
for the colder walls and the lower humidity. 1 
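
The humidity argument can be checked with a short calculation. The sketch below is an illustration under stated assumptions, not part of the original comment: it uses the Magnus approximation for saturation vapor pressure (a standard meteorological formula, with temperatures in degrees Celsius) and assumes that infiltrating air is warmed with no moisture added, so its vapor pressure stays constant. The function names and the 21 degree Celsius indoor setting (roughly 70 degrees Fahrenheit) are choices made for this example.

```python
import math


def saturation_vapor_pressure_hpa(temp_c):
    """Magnus approximation to saturation vapor pressure over water, in hPa."""
    return 6.112 * math.exp(17.62 * temp_c / (243.12 + temp_c))


def indoor_relative_humidity(outdoor_temp_c, outdoor_rh, indoor_temp_c):
    """Relative humidity after outdoor air is warmed indoors with no moisture added.

    The actual vapor pressure is unchanged by warming, so the relative humidity
    falls in proportion to the rise in saturation vapor pressure.
    """
    vapor_pressure = outdoor_rh * saturation_vapor_pressure_hpa(outdoor_temp_c)
    return vapor_pressure / saturation_vapor_pressure_hpa(indoor_temp_c)


INDOOR_TEMP_C = 21.0   # roughly 70 degrees Fahrenheit
OUTDOOR_RH = 0.70      # same outdoor relative humidity in every case

results = {
    outdoor: indoor_relative_humidity(outdoor, OUTDOOR_RH, INDOOR_TEMP_C)
    for outdoor in (-15.0, 0.0, 10.0)
}
for outdoor, rh in results.items():
    print(f"outdoor {outdoor:6.1f} C -> indoor relative humidity {100 * rh:5.1f}%")
```

For a fixed outdoor relative humidity, the computed indoor relative humidity is lower the colder it is outside, which is the mechanism described in the paragraph above. Real houses add moisture from cooking, bathing, and occupants, so actual indoor humidity will be higher than this lower bound.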

Lower outdoor temperatures will also lead to more air movement 
(drafts) within a house of a given design. This air movement cools the 
body, which may induce individuals to raise thermostat settings fur¬ 
ther. 

III. Insulation, Efficiency, and Fuel Prices 

Friedman correctly asserts that optimal house design in cold climates 
will include more insulation than in warm climates. He goes on to 
assume that houses in fact are better insulated in cold than in warm
climates. While this may be true for post-“energy crisis” houses in 
North America, it is less accurate for older houses. Until the early 
1960s, most new houses built in the northern United States and 
Canada contained no wall insulation and limited ceiling insulation 
(Energy, Mines and Resources 1976, pp. 35-36). Storm windows 
were often primitive or omitted. Many houses built before the energy 
crisis in these cold climates were poorly insulated and sealed. While 
storm windows, weather stripping, and attic insulation have been up¬ 
graded in much of this old stock, the walls generally remain inade¬ 
quately insulated. For these older northern houses, the marginal cost 
of a degree of indoor air temperature may be only modestly lower 
than for similar homes in the South and Southwest. More important, 
these older houses will have cold walls, dry air, and drafts in mid- 

1 While in theory a thermostat may also radiate heat to outside walls, the temperature
sensing element is usually shielded by reflective material and the unit is on an inside 
wall. This shielding causes the temperature of the sensing element to be determined by 
conduction from the air, not by radiation. 
winter. At indoor air temperatures of 70 degrees, they will feel chilly.
In contrast, houses built since the energy crisis have generally been 
relatively well insulated in the U.S. Northeast and Midwest and in 
Canada. A well-designed new house will use less than half as much 
energy per square foot to maintain a given temperature level in mid¬ 
winter as that required for an older house. Thus within Chicago or 
Toronto, houses of similar size but different vintage may differ by a 
factor of two or more in the marginal cost of an extra degree of 
temperature. In the South and Southwest, we expect that high air 
conditioning costs have led to improvements in insulation levels, par¬ 
ticularly where summers are hot and humid. However, we are pre¬ 
pared to agree with Friedman that new houses in the North are sub¬ 
stantially better insulated than new houses in the South. 

Other factors also cause variations in the marginal cost of a degree 
of indoor temperature. For a given energy source, heating equipment 
efficiencies may differ. A high-efficiency condensing gas furnace, 
which is more common in cold climates, uses one-third less fuel than a 
conventional gas furnace. An electric heat pump in the South may use 
less than half the electricity of direct resistance heating, while the 
same heat pump will provide a smaller advantage in the North. Fi¬ 
nally, at any given time, the price of a unit of energy from gas, pro¬ 
pane, oil, and electricity may vary by a factor of two, both between 
regions and within a region. 

Therefore, we expect the marginal cost of a degree of indoor tem¬ 
perature to vary considerably within a region according to the insula¬ 
tion and weather stripping of the house, the type of heating equip¬ 
ment, and the price of the fuel. Furthermore, we believe that these 
variations will cause the marginal cost of a degree to be higher in many 
northern houses than in many southern houses. It is not obvious that, 
on average, the marginal cost of a degree is greatly lower in northern 
than in southern houses. 

IV. The Linearity Assumption 

Friedman asserts that heating costs are proportional to the difference 
between the indoor and outdoor temperatures (p. 1096). While con¬ 
duction losses are proportional, air exchange losses may rise by as 
much as the square of the temperature difference because the volume 
of air exchanged rises with the temperature difference, 2 and the heat 
loss per unit of air is proportional to this difference. Air change losses 
account for 20-40 percent of heat losses in older houses (Energy, 
Mines and Resources 1976, p. 35), so this nonlinearity is not trivial.

2 The greater the temperature difference between indoors and outdoors, the greater 
the pressure difference causing air to enter the basement and first floor and to exit 
through the upper floors. 
Even if wall and ceiling heat losses were linearly related to the
difference between indoor and outdoor temperatures, heating costs 
would be nonlinear for three important types of heating: conven¬ 
tional oil and gas furnaces and electric heat pumps. For conventional 
furnaces, the most common type of household heating plant, the 
nonlinearity is caused by the combustion air required by the units. 
Usually the furnace draws in outside air to replace house air expelled 
up the chimney. The heat loss is the product of the time of operation 
of the furnace and the temperature difference between the indoor 
and outdoor air, and as a first approximation it will vary with the 
square of the temperature difference. Since heat loss up the chimney 
may account for 25 percent of the energy in the fuel for a conven¬ 
tional furnace (Dewees 1979), this nonlinearity is not trivial. For elec¬ 
tric heat pumps, as the outdoor temperature falls, the heating de¬ 
mand increases while the available heat in the outside air is reduced, 
reducing the efficiency of the heat pump. The nonlinearity involved 
here is substantial. 3

These nonlinearities cause the marginal cost of a degree of indoor 
temperature to increase as the outdoor temperature falls. Friedman 
recognizes that substantial nonlinearity of this sort would lead, in his 
model, to lower indoor temperatures in cold climates (p. 1096). 
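
A minimal numerical sketch of this argument follows. It assumes a heat-loss function with a conduction term proportional to the indoor-outdoor temperature difference and an air-exchange term proportional to its square, as described above. The coefficients are hypothetical, chosen only so that air exchange accounts for about 29 percent of losses at a 20-degree difference, inside the 20-40 percent range cited for older houses.

```python
def heat_loss(delta_t, conduction_coeff=1.0, infiltration_coeff=0.02):
    """Heat loss rate: linear conduction term plus quadratic air-exchange term.

    delta_t is the indoor minus outdoor temperature difference. Both
    coefficients are hypothetical values in arbitrary units.
    """
    return conduction_coeff * delta_t + infiltration_coeff * delta_t ** 2


def marginal_loss(delta_t, conduction_coeff=1.0, infiltration_coeff=0.02):
    """Extra heat loss from one more degree of difference (derivative of heat_loss)."""
    return conduction_coeff + 2.0 * infiltration_coeff * delta_t


# Share of total losses due to air exchange at a 20-degree difference.
air_exchange_share = (0.02 * 20 ** 2) / heat_loss(20)

for delta_t in (10, 20, 40):
    print(f"difference {delta_t:2d}: total loss {heat_loss(delta_t):6.1f}, "
          f"marginal loss per degree {marginal_loss(delta_t):4.2f}")
```

Under a purely linear specification the marginal loss would be the same at every temperature difference; with the quadratic term it grows as the outdoor temperature falls, which is the nonlinearity at issue.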

V. The Demand for Comfort 

Friedman’s model must be expanded to account for the difference 
between comfort and temperature. We introduce comfort K, which 
represents a person’s perception of warmth given air temperature, 
humidity, and surface temperature of the walls. To simplify, we as¬ 
sume that humidity and radiation losses are linearly related to the 
difference between indoor and outdoor temperatures. Then utility is 

$$U = U(K, X), \tag{1}$$

where $K$ is a function of both indoor and outdoor temperatures:

$$K = K(T_i, T_o). \tag{2}$$

In the heating season, $T_i > T_o$. Because of the importance of radiation
heat losses and lower relative humidity when outdoor temperature
declines, we assume that $\partial K/\partial T_i = f(T_o)$, where $f' < 0$. Thus the
contribution to comfort of an additional degree of indoor tempera¬ 
ture increases when the outdoor temperature falls. We shall follow 
Friedman's assumption (contradicted by the arguments above) that 
the cost of heating is linear in the difference between indoor and 

3 The efficiency (coefficient of performance) may be 3.0 at 10 degrees Celsius, 2.3 at
0 degrees, and less than 1.5 at -15 degrees (Energy, Mines and Resources 1987).
outdoor temperatures. Contrary to Friedman’s model, the marginal
cost of a unit of comfort now depends on the outdoor as well as the
indoor temperature:

$$U = U(K(T_i, T_o), X). \tag{3}$$

Utility maximization implies that

$$\frac{U_1}{U_2} = P_h \cdot \frac{dH}{dT_i} \cdot \frac{dT_i}{dK} = P_h \cdot \text{constant} \cdot \frac{1}{f(T_o)}, \tag{4}$$

where $H$ is the quantity of heat and $P_h$ is the price. Here, as $T_o$ declines,
$1/f(T_o)$ decreases, so the choice of indoor temperature is not
independent of the outdoor temperature. It is no longer true that
identical houses will be heated equally regardless of outdoor temperature.
Given the utility function (3), every person would have an optimal
temperature $T^*$, given $T_o$, if heat were costless. We argue that
$T^*$ will rise as $T_o$ falls. With costly heating, the chosen temperature, $T$,
will depend on the cost of heat, the outdoor temperature $T_o$, the
insulation of the house, and the elasticity of demand for comfort. If
demand were perfectly price inelastic, $T$ would equal $T^*$ and would
rise as $T_o$ falls.

Friedman makes no explicit assumption about the elasticity of de¬ 
mand for indoor temperature or comfort, but since his model is 
driven entirely by differences in the price of a degree of temperature, 
he must assume a significant price elasticity of demand for tempera¬ 
ture. In contrast, we believe that people over a wide range of incomes 
and heating costs choose similar temperature levels. Empirical evi¬ 
dence to support this view is offered below. 


VI. Predictions and Tests 

We believe that many factors may influence the choice of indoor
temperature during winter in cold and warm climates, including self-selection
and migration, choice of clothing, social conventions, income,
and the marginal cost of a unit of comfort, so that no useful
general prediction can be made about indoor temperatures maintained
in cold and warm climates. When the outdoor temperature
plunges, comfort falls because of the radiation heat loss and the decline
in the relative humidity, and the marginal cost of a degree of
indoor temperature increases because of nonlinearities in heat loss
and heat generation. Theory cannot determine which effect dominates.
As for interregional comparisons, those factors that raise the
marginal cost of heating in warm regions, such as the lower level of
insulation, tend to reduce optimal thermostat settings. The nonlinearity
of the marginal cost of heating works against this tendency, but the
comfort factors that we stress would tend to reinforce it. Again, theory
alone cannot determine the net direction of these effects.

Furthermore, the aggregate survey data presented in table 1 sug¬ 
gest that differences in temperature settings between cold and warm 
climates, while small, are completely contrary to Friedman’s casual 
observation. In each region, among regions, for each fuel, and for all 
fuels, thermostats are set lower in cold climates than in warm climates. 
These data lend credence to our belief that regional differences are 
multifaceted. Friedman could explain this result only by accepting 
that marginal costs of heating rise significantly as temperature falls. 

Our model, combined with the hypothesis that the demand for 
comfort is price inelastic, yields two predictions that are testable and 
differ from those of Friedman's model. First, in a given house, we 
predict that it will be rational to raise the thermostat setting by a few 
degrees when the outdoor temperature plunges, not because the mar¬ 
ginal cost of a degree has changed but because the marginal utility of 
one more degree increases as the optimal (costless) temperature rises 
above a given thermostat setting. Friedman argues that rationality 
demands no such change. Second, we predict that poorly insulated 
houses in cold climates will be maintained at a higher temperature 
than well-insulated houses in the same climates to maintain equal 
comfort. Friedman would predict the opposite. 

Both of these predictions, as well as the elasticity of demand for 
comfort, can be tested without leaving Chicago (or Toronto). First, in 
an older poorly insulated house, one could compare the thermostat 
setting in midwinter with that in the spring or fall. An increase will 
confirm the importance of focusing not just on temperature but on 
comfort. The greater the increase in the thermostat setting, the more 
inelastic is the demand for comfort. 

Second, the large differences in the marginal cost of a degree of 
temperature among different vintages of the Chicago housing stock 
would allow tests to discriminate between Friedman’s model and the 
Dewees-Wilson model. Friedman would predict higher temperatures 
in well-insulated houses than in badly insulated houses, with fuel 
costs, income, and other variables held constant. We would predict 
the opposite. A survey of thermostat settings in Chicago could test 
whether those settings are higher in old than in new houses, other 
things equal. If they are, then our inclusion of the comfort factor is 
validated, and the price elasticity of demand for comfort is small. If 
they are not, then either the price elasticity of demand for comfort is 
large or comfort depends primarily on temperature, or both. 

Finally, to test the price elasticity of demand for comfort, one could 
conduct a survey of thermostat settings of apartment dwellers who 
pay for their heat and of those whose rent includes a fixed charge for 
heating. Small differences in typical thermostat settings across these 
two groups, other things equal, would indicate that the demand for
comfort is price inelastic. 

The data in table 1 suggest the likely outcome of these tests of price 
elasticity. First, differences in average temperature settings both 
within and between different regions are very small. Second, while 
electricity is generally more expensive than gas (except in warmer 
areas of the South, where electric heat often implies a heat pump), 
temperature settings are lower for electricity than for gas in only 
three out of seven subregions, and then by very small amounts. 4 
Third, the temperature setting for those who do not pay for their fuel 
is on average only 0.6 degrees higher than for those who do pay for 
their fuel. Unless the comfort factors that we have identified are 
distributed nonrandomly among those who do and do not pay for 
their heat, the price elasticity of demand for temperature must be 
very low. 

VII. Conclusion 

In this comment we have questioned two assumptions underlying 
Friedman’s model: (a) that indoor comfort depends only on air tem¬ 
perature and ( b ) that the cost of heat is linear in the difference be¬ 
tween indoor and outdoor temperatures. We suggest a more complex 
model that in turn leads to quite different hypotheses. We present 
data that are inconsistent with the strong predictions of Friedman’s 
model, indicating that consumers’ choices regarding indoor tempera¬ 
tures represent the solutions of more complex problems. Further 
empirical investigation of this issue should employ disaggregate data 
so that the several factors that we have identified can be addressed 
rigorously. 
Trade Policy and Market Structure. By Elhanan Helpman and Paul R. Krugman.

Cambridge, Mass.: MIT Press, 1989. Pp. xi+191. $24.95.

The useful development of an economic idea depends critically on one’s 
ability to formalize it accurately and tractably. David Ricardo’s formulation of 
the theory of comparative advantage, based on the assumptions of competi¬ 
tive trade and constant returns technologies, accurately captured many of the 
salient features of the trade flows he observed and also proved capable of 
extension and refinement as new theoretical methods became available. What 
is now called the Heckscher-Ohlin theory is a natural and powerful development
from Ricardo’s original ideas, and, as Edward Leamer shows in his 1984
monograph, it is a theory that does a good job of accounting for much of the 
observed trade flows in today’s world: Wine is still moving from Portugal to 
England, and not the other way around. 

But why are Volkswagens moving from Germany to Italy and Fiats from 
Italy to Germany? Stimulated in part by the enormous success of the Euro¬ 
pean Common Market, a number of economists have recently inquired into 
the sources of such intraindustry trade, an inquiry that naturally led back to 
Adam Smith’s ideas on specialization, increasing returns, and market size. 
Certainly his observations seem much more germane for understanding the 
role of intraindustry trade in manufactured goods than the resource-based 
theory of comparative advantage, but while Ricardo’s ideas have benefited 
from 150 years of very fruitful theoretical development, Smith's, until recently,
had remained largely as he set them out in the Wealth of Nations.

It has been observed for some time that Hotelling's model of consumers 
differently located in a space of preferred product attributes provides one 
way of formalizing the benefits of specialization—in this case, of product 
variety—that permits calculation of equilibria in the presence of increasing 
returns to scale. Lancaster (1979), in particular, demonstrated the usefulness
of this framework for thinking about trade as well as other issues. Spence 
(1976) and Dixit and Stiglitz (1977) achieved the same objective in an even 
simpler way, by dispensing with Hotelling's spatial structure and simply pos¬ 
tulating a representative consumer with a taste for variety. Either formulation 
permits a rigorous development of the theory of a monopolistically competi¬ 
tive industry, in which related but differentiated products are produced 
under conditions of increasing returns and free entry. 

The “new international economics” is a common label for the growing body 
of theoretical work in international economics that has pursued Smith’s ideas, 
using either the Hotelling-Lancaster framework or the Spence-Stiglitz-Dixit 
model to capture the benefits of specialization in a tractable way. In their 1985 
monograph, Market Structure and Foreign Trade, Helpman and Krugman, both
important contributors to these developments, undertook the systematic
study of the way these technical innovations in the theory of imperfect competition
alter the answers to the classic questions of trade theory: Who trades
what with whom? What are the gains from trade? 

The book is a brilliant success. Helpman and Krugman begin with a com¬ 
pact review of the Heckscher-Ohlin theory and then take the reader through 
a sequence of general equilibrium models, generally with two final goods, one 
produced under constant returns with many firms and another produced in 
an imperfectly competitive setting. They consider first the oligopolistic pro¬ 
duction of a homogeneous good and then go on to apply the Hotelling- 
Lancaster and Spence-Dixit-Stiglitz abstractions to consider imperfectly com¬ 
petitive production of differentiated products. Their method facilitates the 
systematic comparison of the consequences of different assumptions about
technology and the nature of competition within a common, Heckscher- 
Ohlin-like framework. One is able to see which assumptions are essential to 
which results with a clarity that is just not possible through the study of special 
cases as they appear in journal articles. 

One needs this clarity because the issues involved are so difficult. On the 
one hand, all the ways we have for thinking about imperfect competition and 
increasing returns seem to suggest new reasons why trade is beneficial (in 
addition to those arising from differences in factor endowments). More Cour¬ 
not competitors, more contestants for contestable markets, more varieties for 
variety-loving consumers: all should result from freer trade and all should 
benefit consumers. On the other hand, once one moves away from competitive
assumptions, welfare questions involve complicated, second-best balancing
of different kinds of distortions, issues that cannot be resolved by a
straightforward application of the first welfare theorem. How do these con¬ 
siderations balance out? Well, it depends. The contribution of Helpman and 
Krugman (1985) is to set out just what it depends on and how, as well as this
can be done given the complexity of the questions addressed.

Market Structure and Foreign Trade is concerned with welfare issues, but in a 
way that does not make very direct contact with the current practice of com¬ 
mercial policy. In each market situation they analyze, Helpman and Krugman 
compare a country’s welfare under untaxed trade with its welfare under 
complete autarky. This comparison is well designed to bring out the main 
features of each model’s underlying structure, but not to deal with the welfare
consequences of the most often discussed forms of specific intervention: import
substitution, export promotion, infant industry protection, and so on.
The authors' new monograph, Trade Policy and Market Structure, is designed to 
deal with the consequences of specific tariff and quota policies that fall short 
of a reversion to autarky, with emphasis on the kinds of policies that are in
wide use in the world today.

Throughout the book, the basic unit of analysis is a single industry, with 
welfare changes measured by areas under demand and supply curves. The
theory of optimal tariffs under competition is reviewed in chapter 2. Subse¬ 
quent chapters introduce various forms of imperfect competition. Policy is¬ 
sues raised by intraindustry trade, here considered using the Spence-Dixit- 
Stiglitz model exclusively, are treated in chapter 7. For the most part, the 
analysis focuses on a single decision maker dealing with passive opponents, 
but strategic issues are also discussed. A very wide variety of models is carefully
analyzed in these chapters, but it is difficult to distill any general conclusions.
As a consumer of theory, I find myself to be more Hotelling-like,
preferring a single ideal model (if only I knew what it was!) than having a 
taste for variety per se. This book, like its predecessor, will be taken for 
granted as shared background in future scholarly discussion of the issues it 
treats, but it will not equip this future discussion with as clear and orderly a 
framework as their first book did. 

Chapter 8 of Trade Policy and Market Structure contains brief reviews of work 
by Dixit, Baldwin and Krugman, Venables and Smith, and Harris and Cox 
that may point the way to a more fruitful application of the new international 
trade theory. These quantitative case studies go beyond a cataloging of logical 
possibilities to focus on specific models that seem to capture situations in 
particular industries, and thus they permit the exercise of judgment and the 
use of evidence to help determine which theoretically possible effects are 
small and which are critical. I suspect that as we gain more experience in such 
quantitative applications of the new trade models, a clearer picture will 
emerge about which models are most widely useful and which theoretical 
questions merit more attention. 

In view of the authors’ evident awareness of recent developments in game 
theory, I found their emphasis on the traditional, static noncooperative 
games to be curious. Why are trade theorists not also thinking in terms of 
international arrangements that might approximate first-best allocations even 
in the presence of increasing returns? The main thrust of the analysis in 
chapters 2-7 (the authors do not say this, but I shall) is that unilateral steps 
toward freer trade are generally not in a nation’s interest. It would not seem 
to me excessively utopian for theorists to think about trade in more coopera¬ 
tive terms: politicians have been doing so for years. 

Throughout Trade Policy and Market Structure, Helpman and Krugman ex¬ 
hibit what strikes a reader as extreme discomfort with the policy implications 
of the new trade theory. At one point they even protest that “this is a book 
about theory and methods, and not about policy” (p. 8), as though someone 
else had chosen the title of the book! The clearest statement of the source of 
this discomfort comes in the concluding chapter: “Is the case for free trade, so 
long a central economic tenet, now invalidated? Despite what we have said 
about the effects of trade policy we do not think so” (p. 185). 

But why don’t they think so? This is not so clear. They interpret the variety 
of examples they have reviewed as a pro-free-trade argument: “No blanket 
vindication of aggressive trade policies emerges from the analysis” (p. 186). 
But no blanket vindication of anything emerges from the analysis. In any case, 
“no sophisticated analyst ever thought that free trade was literally optimal for 
a single country. The case for free trade has always rested on an argument 
that it represents a good rule of thumb” (p. 186). Helpman and Krugman
seem not so much to be defending the validity of what they call the “central 
economic tenet” of free trade as trying to avoid the blame for being the first to 
expose its emptiness! 

One can sympathize with this discomfort. In the United States today, pro¬ 
tectionist fantasies about the Yellow Peril are getting entirely out of hand, and 
one would surely not want to be associated with these elements or to be 
responsible for equipping them intellectually. I take disclaimers such as “strategic
trade policy arguments have already appeared in support of views none
of the concept’s originators hold” (p. 8) as attempts by the authors to avoid
such responsibility. This is certainly a defensible personal stance, but what
does it have to do with economic theory? 

The fact is that introducing noncompetitive elements into the theory of
international trade does introduce new possibilities for government intervention
to improve welfare. Insofar as one opens the two Helpman and Krugman
books with a bias toward laissez-faire in international trade, this bias will
necessarily be weakened by the analysis these books provide. I do not see why 
this should surprise anyone or why the authors should be at all apologetic 
about it. Certainly students of industrial organization, even those with a 
strong free-market bias, have long lived with the fact that specific kinds of 
government intervention can in principle be welfare improving in monopolized
markets. Theory does not provide us a blanket vindication of any single,
universally applicable policy conclusion, but it does provide a coherent frame¬ 
work for examining specific policy interventions on their merits, case by case. 
I think that it is a mistake to ask more than this from economic analysis. 

Much of the new trade theory is the application of new theories of imper¬ 
fect competition to an international setting. I could not help being struck by 
the similarities between the ideological discomfort Helpman and Krugman 
are evidently experiencing and the discomfort, also induced by theories of 
imperfect competition, that Schumpeter tried to relieve in Capitalism, Socialism,
and Democracy (1950). If the standard of allocative efficiency is not serviceable
as a rationale for laissez-faire capitalism, what is its economic rationale?
Schumpeter claimed to find the answer in the theory of economic growth, 
though in my opinion we are very short on economic theory that clarifies his 
observations or permits working out their implications more completely. 

Perhaps the case for international free trade as a “good rule of thumb,” to
which Helpman and Krugman allude but which they nowhere spell out, rests 
as well on links between trade and growth. If so, it too has yet to be made. As 
was the case with Smith’s original observations on specialization and market 
size, we shall need a tractable analytical framework to make progress on this 
issue. The two Helpman and Krugman monographs have amply shown that 
when Smith’s ideas are developed within a clear theoretical framework, they 
contain some surprising implications. They are, I think, outstanding illustrations
of why we work to construct useful, explicit theories rather than being
content with good rules of thumb. 

Robert E. Lucas, Jr. 

University of Chicago 

Theory of Industrial Organization. By Jean Tirole.

Cambridge, Mass.: MIT Press, 1988. Pp. 479. $35.00. 

Jean Tirole has written a masterful book that should be invaluable to those 
who want a lucid presentation of the significant recent theoretical develop¬ 
ments in industrial organization. The book should be required reading for 
graduate students and professors specializing in industrial organization. 

There has been a burst of theoretical research in industrial organization, 
much of it inspired and conducted by Tirole. Once considered by theorists as 
a wasteland, industrial organization has now emerged as a place in which the 
most advanced theoretical tools can be applied. This book eloquently de¬ 
scribes these major theoretical developments. Tirole is a good writer and is at 
his best when he provides simple (or pretty simple) mathematical models to 
illustrate new ideas. His ability to capture the essential idea and strip away 
needless complexity is remarkable and reflects his sharp analytical thinking. 

This review first discusses the book's contents and then the topics not cov¬ 
ered. Next, I assess the difficulty of the book and finally suggest the courses 
for which it is most appropriate. 

I. Subjects Covered 

The book is almost entirely theoretical with a focus on the latest theoretical 
advances. An introductory chapter on the theory of the firm provides an 
excellent overview of principal-agent theory. The book is then divided into 
two parts. The first part (chaps. 1-4) covers single-firm behavior, while the 
second part (chaps. 5-11) covers multiple-firm behavior and is the distin¬ 
guishing feature of the book. Each chapter is wonderfully written. 

Chapters 1-4 consider the monopolist’s optimal choice of durability, qual¬ 
ity, prices, and control of distribution. The heuristic proof of Coase’s result 
regarding the lack of monopoly power in a durable-good industry is one of
the clearest I have seen. The discussion of quality goes through the Spence
result on optimal product diversity and the signaling literature when consumers
must infer quality from a firm's pricing. I have always been skeptical about
the empirical importance of the signaling equilibria in which a firm signals 
quality by sinking costs; after reading Tirole's discussion, I remain skeptical. 
The discussion regarding asymmetric information and government interven¬ 
tion is all right but sometimes comes close to confusing the theoretical possi¬ 
bility of remedying an externality with the practical ability to do so. 

The discussion of price discrimination goes through first-, second-, and 
third-degree price discrimination in more detail than is typically done. The 
material in the supplementary section on optimal nonlinear tariffs is excellent 
but a bit difficult. It is hard to find this material anywhere else in such a 
readable form. For example, Tirole concisely discusses the condition on the
distribution function of consumer types that is needed for the optimal nonlinear
price to be concave, a difficult but key topic in nonlinear pricing. I
would have spent more time on two-part tariffs and stressed more some of 
the relationships between consumer heterogeneity and the optimal two-part
tariff. (For example, the fixed fee tends to rise and the marginal charge fall as
consumers become more homogeneous.) Tirole does a good job of proving
the welfare properties of price discrimination. One minor quibble is that the
section entitled “Welfare” (p. 158) talks about output, not welfare.

I thank Andrew Rosenfield and George Stigler for helpful comments.

Vertical relationships is the last topic in part 1. It is not clear that this 
chapter belongs in the first part of the book on single-firm behavior since the
point of many vertical restrictions is to improve a firm’s ability to compete 
with its rivals. Vertical relationships is a topic in which empirical examples 
abound and in which a student may need help understanding the empirical 
relevance of the models. Tirole does provide some empirical glimpses, but the 
professor will need to supplement them. Tirole’s discussion of the antitrust 
laws on vertical control is so short (one paragraph) that no student will under¬ 
stand it. Tirole discusses many of the most important reasons for vertical 
foreclosure and control but omits some. For example, the rationale for exclusive
dealing as a way to prevent free-riding among manufacturers is omitted.
Tirole seems overly modest in the discussion on foreclosure and stresses the
premature nature of any conclusions. Still, I am concerned that antitrust
enforcement agencies will read Tirole to say that foreclosure is a potentially
serious worry that has the support of theoretical models. I would have preferred
a statement such as “Although some models of vertical foreclosure
have been developed, most rely on strong assumptions of asymmetry between
incumbent and entrant. It appears then that foreclosure is likely to succeed
only when such asymmetry can be preserved over time. Vertical foreclosure is
an area in which more work is needed.” Some of Salop’s work on foreclosure
and raising a rival’s costs could be profitably discussed at greater length. 

Part 2 analyzes competition among firms. Chapter 11 (which is an appendix
to the book) should be read before one begins part 2 since it provides the 
basics of game theory used in the remainder of the book. It starts out with 
simple concepts but quickly gets into sophisticated topics such as Bayesian 
perfect equilibrium. It is an excellent primer, but if students have never seen 
Bayes's theorem, the end of this chapter will be too hard. The supplementary 
section is clearly designed for advanced students interested in existence
proofs that freely utilize notions of compactness, quasi concavity, and Jensen's
inequality. In short, this chapter will be challenging even for top graduate
students. 

Part 2 goes on to analyze static models and dynamic models of oligopoly 
and monopolistic competition. The analysis of the standard models (Cournot
and Bertrand) is insightful as is the discussion of monopolistic competition. 
The discussion of dynamic models treats subjects that are not presented elsewhere
in the same detailed but readable way. Tirole is careful to emphasize
that, at the current stage of development, there is an “embarrassment of
riches” in the sense that in many dynamic games almost any type of behavior
can be justified as an equilibrium. He candidly admits that little attention has
yet been paid to empirical implications of these models. The discussion of
how a rival’s earlier behavior can affect others’ beliefs about the rival and
benefit the rival is clearly expressed and is an important concept that students
will easily grasp. The supplementary sections of several of these chapters
contain material that will appeal exclusively to advanced graduate students. 

The last three chapters (chaps. 8-10) really form the heart of the book. 
These chapters apply the insights of game theory to strategic competition.
They contain clear and detailed discussions of topics that students might find
inaccessible if they relied on published articles. Tirole characterizes precommitments
in two-period games by how they affect a rival’s decisions in the
future, and he goes quite a way in making empirically testable theoretical
predictions. He stresses the importance of strategic complements (e.g., in a
Bertrand game, if one rival cuts price, the other does too) and strategic
substitutes (e.g., in a Cournot game, if one rival raises quantity, the other
lowers quantity) and explains how the object of a strategic game is to soften 
competition. He then provides a survey of excellent theoretical examples of 
how rivals can affect their reaction curves. These examples include tie-in 
sales, use of retroactive price clauses, complementarity of products with a 
rival, and choice of location. The last chapter covers patent races and time of 
technology introduction. The chapter also discusses the adoption of stan¬ 
dards and licensing. Overall, it highlights some recent and valuable new de¬ 
velopments. 

II. Subjects Not Covered 

Tirole is completely forthright in explaining that his text covers selected 
theoretical topics and is not intended as a general overview of industrial 
organization. There are two major (and intended) omissions as compared 
with a standard textbook. First, the text makes no systematic attempt to sur¬ 
vey the empirical literature that bears on the theoretical topics. Tirole can¬ 
didly admits in several places that the empirical implications of several of the 
advances in game theory have yet to be developed and tested. The omission 
of empirical discussions could leave readers in doubt about the relevance of 
the models and virtually guarantee that skeptics of the game theory approach 
will remain skeptical. For these reasons, this book is most appropriate for 
advanced students who have already been exposed to some of the evidence. 
Tirole recommends reliance on other texts for empirical evidence, but that 
approach may be a bit difficult since other texts may not be organized in the 
same way as Tirole’s and may not discuss the same models. 

The second major omission is the lack of any systematic treatment of regu¬ 
lation and antitrust. It is in keeping with the theoretical focus of the book to 
omit antitrust developments, but it would have been consistent to have a
chapter on the recent theoretical advances in the theory of regulation. In¬ 
deed, the recent developments regarding regulation with asymmetric infor¬ 
mation would fit in nicely with the excellent introductory chapter on the 
theory of the firm. The theory of natural monopoly and sustainability would 
also fit in nicely with the rest of the book. The profession would benefit if 
Tirole had written such a chapter given his remarkable ability to explain 
complicated ideas in a simple fashion. 

III. The Level of Difficulty and Pedagogical Features

The level of difficulty of the book raises some questions about whether under¬ 
graduates could handle the material. Tirole excels at simplifying and present¬ 
ing simple mathematical models. Overall, he does a terrific job of minimizing 
reliance on complicated mathematical techniques. Even so, undergraduates 
could find the mathematics tough or intimidating. Line integrals are dis¬ 
cussed in the preface, control theory is needed for a problem in chapter 1, 
and game theory is used extensively (after being presented in a clear but 
difficult appendix) later in the book. None of these techniques is so com¬ 
plicated that a mathematically sophisticated undergraduate should be put off, 
but I suspect that most undergraduates will be at least a little nervous. 

The chapters of the book vary in their level of difficulty. The book does not 
progress from simple to difficult. The first chapters on principal-agent theory
and the Coase problem of durable goods seem more difficult than those at the 
end of the book. In particular, the chapters on strategic competition are not, 
for the most part, mathematically demanding. 

Each chapter contains a supplementary section for advanced students, lots 
of well-thought-out exercises within the chapter and at the back of the book, 
and worked-out solutions. The text is a delight to teach from because the 
exercises provide a great way to illustrate ideas that Tirole mentions but does 
not have time to treat. The instructor will never have to worry about questions 
for problem sets or final exams. The difficulty of the exercises is indicated 
(three levels), a helpful pedagogical idea. The only pedagogical improvement 
I could suggest is the addition of an author index. It is hard to guess in which 
chapter an article falls, and an index of this type would help. 

IV. What Courses Can Use This Book? 

I would use this book as a sole text only in a specialized, advanced graduate 
course in industrial organization because of its idiosyncratic coverage of top¬ 
ics, but I would use parts of the book to complement other readings in a more 
general graduate course. I have taught graduate industrial organization with 
Robert Gertner at the University of Chicago for several years and have used 
the Carlton-Perloff (1990) textbook (which is slightly less advanced than 
Tirole’s, especially on dynamic games under uncertainty, but deals with em¬ 
pirical and theoretical topics) and selected chapters of Tirole plus journal 
articles. In particular, Gertner and I use several of Tirole’s chapters to discuss
strategic behavior (chaps. 8-10) and to introduce game theory (chap. 11). I 
would feel comfortable using parts of selected chapters in an advanced 
undergraduate course. Again, parts of chapters 8-10, the heart of the book,
would be the most appropriate, not too hard, and most valuable.

In summary, this book takes as its objective the presentation of several of 
the latest theoretical advances in industrial organization. It succeeds admira¬ 
bly in achieving its objective. Tirole is a gifted writer who excels at explaining 
the essence of complicated models. 

Dennis W. Carlton 

University of Chicago 